IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



In re application of: Kevin P. BAKER, et al.
Application Serial No. 10/015,499
Filed: December 11, 2001
For: PRO1788 POLYPEPTIDES
Examiner: Hayes, Robert Clinton
Art Unit: 1649
Confirmation No. 6886
Attorney's Docket No. GNE-2830 P1C42
Customer No. 77845



FILED VIA EXPRESS MAIL NO. EM 305 114 229 US - DECEMBER 5, 2008

ON APPEAL TO THE BOARD OF PATENT APPEALS AND INTERFERENCES

MAIL STOP APPEAL BRIEF - PATENTS
Commissioner for Patents
P.O. Box 1450
Alexandria, Virginia 22313-1450

Dear Sir:

This Appeal Brief, filed in connection with the above captioned patent application, is 
responsive to the Final Office Action mailed on March 10, 2008. A Notice of Appeal was filed
on August 8, 2008.

I. REAL PARTY IN INTEREST 

The real party in interest is Genentech, Inc., South San Francisco, California, by an 
assignment of the patent application U.S. Serial No. 09/946,374 recorded January 8, 2002, at 
Reel 012288 and Frame 0504. 

II. RELATED APPEALS AND INTERFERENCES 

The claims pending in the current application are directed to a polypeptide referred to 
herein as "PRO1788". There exist two related patent applications, (1) U.S. Serial No. 
10/017,390, filed December 13, 2001 (containing claims directed to polynucleotides encoding 
PRO1788 polypeptides), and (2) U.S. Serial No. 10/015,653, filed December 11, 2001 
(containing claims directed to antibodies that bind PRO1788 polypeptides). The 10/017,390 
application is still pending. The 10/015,653 application is also under final rejection from the 
same Examiner and based upon the same outstanding rejection, and appeal of this final rejection 
is being pursued independently and concurrently herewith. 

III. STATUS OF CLAIMS 

Claims 28-35 and 38-40 are in this application. 
Claims 1-27 and 36-37 are canceled. 

Claims 28-35 and 38-40 stand rejected and Appellants appeal the rejection of these claims. 

A copy of the rejected claims involved in the present Appeal is provided in the Claims 
Appendix. 

IV. STATUS OF AMENDMENTS 

A summary of the prosecution history for this case is as follows: 

On March 10, 2008, a final Office Action was mailed by the USPTO. A Notice of Appeal 
was filed on August 8, 2008. 

Claims 28-35 have been amended in a supplemental amendment/response to the final 
Office Action of March 10, 2008, filed concurrently with the present appeal. A copy of the 
rejected claims in the present Appeal is provided in the Claims Appendix, incorporating the 
amendment (Section VIII). 

V. SUMMARY OF CLAIMED SUBJECT MATTER 

The invention claimed in the present application is related to an isolated polypeptide 
comprising the amino acid sequence of the polypeptide of SEQ ID NO:397 (Claims 33a and 34); 
the amino acid sequence of the polypeptide of SEQ ID NO:397, lacking its associated signal 
peptide (Claims 33b and 35); or the amino acid sequence of the polypeptide encoded by the 
full-length coding sequence of the cDNA deposited under ATCC accession number 203480 (Claims 
33c and 38). The invention is further directed to polypeptides having at least 80% (Claim 28), 
85% (Claim 29), 90% (Claim 30), 95% (Claim 31), or 99% (Claim 32) amino acid sequence 
identity to the amino acid sequence of the polypeptide of SEQ ID NO:397; the amino acid 
sequence of the polypeptide of SEQ ID NO:397, lacking its associated signal peptide; or the 
amino acid sequence of the polypeptide encoded by the full-length coding sequence of the cDNA 
deposited under ATCC accession number 203480, wherein the nucleic acid encoding said 
polypeptide is amplified in colon tumors. The invention is further directed to a chimeric 
polypeptide comprising one of the above polypeptides fused to a heterologous polypeptide 
(Claim 39), and to a chimeric polypeptide wherein the heterologous polypeptide is an epitope tag 
or an Fc region of an immunoglobulin (Claim 40). 

The full-length PRO1788 polypeptide having the amino acid sequence of SEQ ID 
NO:397 is described in the specification at, for example, page 32, line 36 to page 33, line 37, 
page 264, line 32 to page 266, line 34, page 353, lines 17-23, Example 119, in Figure 232 and in 
SEQ ID NO:397. The cDNA nucleic acid encoding PRO1788 is described in the specification 
at, for example, Example 119, in Figure 231 and in SEQ ID NO:396. Page 299, lines 13-17 of 
the specification provides the description for Figures 231 and 232. PRO polypeptide variants 
having at least about 80% amino acid sequence identity with a full length PRO polypeptide 
sequence or a PRO polypeptide sequence lacking the signal peptide are described in the 
specification at, for example, page 302, lines 4-26. The preparation of chimeric PRO 
polypeptides, including those wherein the heterologous polypeptide is an epitope tag or an Fc 
region of an immunoglobulin, is set forth in the specification at page 358, lines 11-34. Examples 
128-131 describe the expression of PRO polypeptides in various host cells, including E. coli, 
mammalian cells, yeast and Baculovirus-infected insect cells. PRO1788 is described as having 
amino acid sequence identity with Dayhoff sequence "GARP_HUMAN", a leucine-rich 
repeat-containing protein encoded by a gene localized in the 11q14 chromosomal region and as 
being a newly identified member of the leucine-rich repeat-containing family (see, for example, 
page 353, lines 17-23). Finally, Example 143, in the specification at page 494, line 20, to page 508, 
line 28, sets forth a Gene Amplification assay which shows that the PRO1788 gene is amplified 
in the genome of certain human colon cancers (see page 506, lines 26-33, and Table 8). 

VI. GROUNDS OF REJECTION TO BE REVIEWED ON APPEAL 

1. Whether Claims 28-35 and 38-40 satisfy the utility requirement under 35 U.S.C. 
§§101/112, first paragraph. 

2. Whether Claims 28-35 and 38-40 satisfy the written description requirement of 35 
U.S.C. §112, first paragraph. 

VII. ARGUMENTS 
Summary of the Arguments 

Issue 1: Utility/Enablement 

Appellants rely upon the gene amplification data of the PRO1788 gene for patentable 
utility of the PRO1788 polypeptide. This data is clearly disclosed in the instant specification in 
Example 143, which discloses that the gene encoding PRO1788 showed significant amplification, 
ranging from 2.12-fold to 6-fold, in eight colon tumors. 

Appellants have submitted, in their Response filed January 18, 2005, a Declaration by Dr. 
Audrey Goddard, which explains that a gene identified as being amplified at least 2-fold by the 
disclosed gene amplification assay in a tumor sample relative to a normal sample is useful as a 
marker for the diagnosis of cancer, and for monitoring cancer development and/or for measuring 
the efficacy of cancer therapy. Therefore, such a gene is useful as a marker for the diagnosis of 
colon cancer, and for monitoring cancer development and/or for measuring the efficacy of cancer 
therapy. 

Appellants have also submitted throughout prosecution history ample evidence to show 
that, in general, if a gene is amplified in cancer, it is more likely than not that the encoded protein 
will be expressed at an elevated level. First, the articles by Orntoft et al., Hyman et al., and 
Pollack et al. collectively teach that, in general, gene amplification increases mRNA expression. 
Second, Appellants have submitted over a hundred references, along with the 
Declarations of Dr. Paul Polakis and Dr. Randy Scott with their Responses filed on 
January 18, 2005 and August 11, 2006, which collectively teach that, in general, there is a 
correlation between mRNA levels and polypeptide levels. 

Further, Appellants submit that one of ordinary skill in the art would know how to make 
and use the recited polypeptide for the diagnosis of colon cancer without any undue 
experimentation, based on the detailed teachings in the specification. 

Accordingly, this enablement rejection under 35 U.S.C. §§101 and 112, first paragraph 
should be withdrawn. 

Issue 2: Written Description 

Claims 28-33 and 39-40 stand rejected under 35 U.S.C. §112, first paragraph as allegedly 
lacking adequate written description. In particular, the Examiner asserts that "a recitation related 
to DNA does not reasonably constitute a 'functional limitation' for the claimed polypeptides." 
The Examiner further asserts that Appellants have not described "a representative number of 
species that have 80-99% homology to SEQ ID NO:397, such that it is clear that they were in 
possession of a genus of polypeptides functionally similar to SEQ ID NO:397." (Pages 10-11 of 
the Office Action mailed May 10, 2005). 

Appellants respectfully submit that the instant claims are similar to the exemplary claim 
in Example 10 of the revised Training Manual on Written Description Guidelines issued by the 
U.S. Patent Office. Appellants respectfully submit that the instant specification evidences the 
actual reduction to practice of the amino acid sequence of SEQ ID NO:397. Thus, the genus of 
polypeptides with at least 80% sequence identity to SEQ ID NO:397, would meet the 
requirement of 35 U.S.C. §112, first paragraph, as providing adequate written description. 

Response to Rejections 

Issue 1. Claims 28-35 and 38-40 are Supported by a Credible, Specific and Substantial 
Asserted Utility, and Thus, Meet the Utility Requirement of 35 U.S.C. §§101/112, First 
Paragraph 

The sole basis for the Examiner's rejection of Claims 28-35 and 38-40 under this section 
is that the data presented in Example 143 of the present specification is allegedly insufficient 
under the present legal standards to establish a patentable utility under 35 U.S.C. §101 for the 
presently claimed subject matter. 

Claims 28-35 and 38-40 stand further rejected under 35 U.S.C. §112, first paragraph, 
allegedly "since the claimed invention is not supported by either a specific and substantial asserted 
utility or a well established utility for the reasons set forth above, one skilled in the art clearly would 
not know how to use the claimed invention." 

Appellants strongly disagree and, therefore, respectfully traverse the rejection. 

A. The Legal Standard For Utility Under 35 U.S.C. §101 

According to 35 U.S.C. §101: 

Whoever invents or discovers any new and useful process, machine, manufacture, 
or composition of matter, or any new and useful improvement thereof, may obtain 
a patent therefor, subject to the conditions and requirements of this title. 
(Emphasis added). 

In interpreting the utility requirement, in Brenner v. Manson, 1 the Supreme Court held 
that the quid pro quo contemplated by the U.S. Constitution between the public interest and the 
interest of the inventors required that a patent Applicant disclose a "substantial utility" for his or 
her invention, i.e., a utility "where specific benefit exists in currently available form." 2 The 
Court concluded that "a patent is not a hunting license. It is not a reward for the search, but 
compensation for its successful conclusion. A patent system must be related to the world of 
commerce rather than the realm of philosophy." 3 

Later, in Nelson v. Bowler, 4 the C.C.P.A. acknowledged that tests evidencing 
pharmacological activity of a compound may establish practical utility, even though they may 
not establish a specific therapeutic use. The Court held that "since it is crucial to provide 
researchers with an incentive to disclose pharmaceutical activities in as many compounds as 
possible, we conclude adequate proof of any such activity constitutes a showing of practical 
utility." 5 

1 Brenner v. Manson, 383 U.S. 519, 148 U.S.P.Q. (BNA) 689 (1966). 

2 Id. at 534, 148 U.S.P.Q. (BNA) at 695. 

3 Id. at 536, 148 U.S.P.Q. (BNA) at 696. 

4 Nelson v. Bowler, 626 F.2d 853, 206 U.S.P.Q. (BNA) 881 (C.C.P.A. 1980). 

In Cross v. Iizuka, 6 the C.A.F.C. reaffirmed Nelson, and added that in vitro results might 
be sufficient to support practical utility, explaining that "in vitro testing, in general, is relatively 
less complex, less time consuming, and less expensive than in vivo testing. Moreover, in vitro 
results with the particular pharmacological activity are generally predictive of in vivo test results, 
i.e., there is a reasonable correlation therebetween." 7 The Court perceived "No insurmountable 
difficulty" in finding that, under appropriate circumstances, "in vitro testing, may establish a 
practical utility." 8 

The case law has also clearly established that Appellants' statements of utility are usually 
sufficient, unless such statement of utility is unbelievable on its face. 9 The PTO has the initial 
burden to prove that Appellants' claims of usefulness are not believable on their face. 10 In 
general, an Appellant's assertion of utility creates a presumption of utility that will be sufficient 
to satisfy the utility requirement of 35 U.S.C. §101, "unless there is a reason for one skilled in 
the art to question the objective truth of the statement of utility or its scope." 11, 12 

5 Id. at 856, 206 U.S.P.Q. (BNA) at 883. 

6 Cross v. Iizuka, 753 F.2d 1047, 224 U.S.P.Q. (BNA) 739 (Fed. Cir. 1985). 

7 Id. at 1050, 224 U.S.P.Q. (BNA) at 747. 

8 Id. 

9 In re Gazave, 379 F.2d 973, 154 U.S.P.Q. (BNA) 92 (C.C.P.A. 1967). 

10 Ibid. 

11 In re Langer, 503 F.2d 1380, 1391, 183 U.S.P.Q. (BNA) 288, 297 (C.C.P.A. 1974). 

12 See also In re Jolles, 628 F.2d 1322, 206 U.S.P.Q. 885 (C.C.P.A. 1980); In re Irons, 
340 F.2d 974, 144 U.S.P.Q. 351 (1965); In re Sichert, 566 F.2d 1154, 1159, 196 U.S.P.Q. 209, 
212-13 (C.C.P.A. 1977). 
Compliance with 35 U.S.C. §101 is a question of fact. 13 The evidentiary standard to be 
used throughout ex parte examination in setting forth a rejection is a preponderance of the 
totality of the evidence under consideration. 14 Thus, to overcome the presumption of truth that 
an assertion of utility by the Appellant enjoys, the Examiner must establish that it is more likely 
than not that one of ordinary skill in the art would doubt the truth of the statement of utility. 
Only after the Examiner has made a proper prima facie showing of lack of utility does the burden 
of rebuttal shift to the Appellant. The issue will then be decided on the totality of evidence. 

The well established case law is clearly reflected in the Utility Examination Guidelines 
("Utility Guidelines"), 15 which acknowledge that an invention complies with the utility 
requirement of 35 U.S.C. §101, if it has at least one asserted "specific, substantial, and credible 
utility" or a "well-established utility." Under the Utility Guidelines, a utility is "specific" when 
it is particular to the subject matter claimed. For example, it is generally not enough to state that 
a nucleic acid is useful as a diagnostic without also identifying the conditions that are to be 
diagnosed. 

In explaining the "substantial utility" standard, M.P.E.P. §2107.01 cautions, however, 
that Office personnel must be careful not to interpret the phrase "immediate benefit to the 
public" or similar formulations used in certain court decisions to mean that products or services 
based on the claimed invention must be "currently available" to the public in order to satisfy the 
utility requirement. "Rather, any reasonable use that an applicant has identified for the invention 
that can be viewed as providing a public benefit should be accepted as sufficient, at least with 
regard to defining a 'substantial' utility." 16 Indeed, the Guidelines for Examination of 
Applications for Compliance With the Utility Requirement, 17 gives the following instruction to 
patent examiners: "If the Applicant has asserted that the claimed invention is useful for any 
particular practical purpose . . . and the assertion would be considered credible by a person of 
ordinary skill in the art, do not impose a rejection based on lack of utility." 

13 Raytheon v. Roper, 724 F.2d 951, 956, 220 U.S.P.Q. (BNA) 592, 596 (Fed. Cir. 1983), 
cert. denied, 469 U.S. 835 (1984). 

14 In re Oetiker, 977 F.2d 1443, 1445, 24 U.S.P.Q.2d (BNA) 1443, 1444 (Fed. Cir. 1992). 

15 66 Fed. Reg. 1092 (2001). 

16 M.P.E.P. §2107.01. 

B. Proper Application of the Legal Standard 

Appellants submit that the evidentiary standard to be used throughout ex parte 
examination of a patent application is a preponderance of the totality of the evidence under 
consideration. Thus, to overcome the presumption of truth that an assertion of utility by the 
Appellant enjoys, the Examiner must establish that it is more likely than not that one of ordinary 
skill in the art would doubt the truth of the statement of utility. Only after the Examiner has 
made a proper prima facie showing of lack of utility, does the burden of rebuttal shift to the 
Appellant. 

Appellants respectfully submit that the data presented in Example 143, starting on 
page 494 of the specification, and the cumulative evidence of record, which 
underlies the current dispute, indeed support a "specific, substantial and credible" asserted utility 
for the presently claimed invention. 

Patentable utility for the PRO1788 polypeptides and their antibodies is based upon the 
gene amplification data for the gene encoding the PRO1788 polypeptide. Example 143 describes 
the results obtained using a very well-known and routinely employed polymerase chain reaction 
(PCR)-based assay, the TaqMan™ PCR assay, also referred to herein as the gene amplification 
assay. This assay allows one to quantitatively measure the level of gene amplification in a given 
sample, say, a tumor extract, or a cell line. It was well known in the art at the time the invention 
was made that gene amplification is an essential mechanism for oncogene activation. Appellants 
isolated genomic DNA from a variety of primary cancers and cancer cell lines that are listed in 
Table 8, including primary lung cancers of the type and stage indicated in Table 7. The tumor 
samples were tested in triplicate with TaqMan™ primers and with internal controls, beta-actin 
and GAPDH, in order to quantitatively compare DNA levels between samples. As a negative 
control, DNA was isolated from the cells of ten normal healthy individuals, which was pooled 
and used as a control. Gene amplification was monitored using real-time quantitative TaqMan™ 
PCR. Table 8 shows the resulting gene amplification data. Further, Example 143 explains that 
the results of TaqMan™ PCR are reported in ΔCt units, wherein one unit corresponds to one PCR 
cycle or approximately a 2-fold amplification relative to control, two units correspond to 4-fold 
amplification, 3 units to 8-fold amplification, etc. 

17 M.P.E.P. §2107 II(B)(1). 

Appellants respectfully submit that a ΔCt value of at least 1.0 was observed for PRO1788 
in at least eight of the tumors listed in Table 8. PRO1788 showed approximately 1.09-2.58 ΔCt 
units, which corresponds to 2^1.09- to 2^2.58-fold amplification, or 2.12-fold to 6-fold 
amplification, in primary colon tumors (CT1, CT3, CT4, CT8, CT9, CT10, CT12 and CT14). (See 
Table 8 and page 506, lines 26-33 of the specification). Accordingly, the present specification 
clearly discloses overwhelming evidence that the gene encoding the PRO1788 polypeptide is 
significantly amplified in colon tumors. 
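
The conversion from ΔCt units to fold amplification described above can be checked with a short calculation. The following is only an illustrative sketch: the function name fold_amplification is hypothetical and does not appear in the specification, and the calculation assumes the idealized relationship stated in Example 143, namely that one ΔCt unit corresponds to one PCR cycle, or approximately a 2-fold amplification relative to control.

```python
def fold_amplification(delta_ct: float) -> float:
    """Convert a ΔCt value to approximate fold amplification relative to control.

    Assumption (taken from the description of Example 143): one ΔCt unit
    corresponds to one PCR cycle, i.e. approximately a 2-fold amplification,
    so fold amplification is 2 raised to the power of ΔCt.
    """
    return 2.0 ** delta_ct


if __name__ == "__main__":
    # Whole-unit values, plus the endpoints of the ΔCt range
    # reported for PRO1788 in primary colon tumors (1.09 and 2.58).
    for delta_ct in (1.0, 1.09, 2.0, 2.58, 3.0):
        fold = fold_amplification(delta_ct)
        print(f"ΔCt {delta_ct:.2f} -> approximately {fold:.2f}-fold")
```

Under this assumption, the reported ΔCt range of 1.09 to 2.58 maps onto roughly the 2.12-fold to 6-fold range recited above, and any ΔCt value of at least 1.0 meets the 2-fold threshold discussed in the Goddard Declaration.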

In support of their showing that these gene amplification values are significant, 
Appellants submitted, in the Response filed January 18, 2005, a Declaration by Dr. Audrey 
Goddard. Appellants particularly draw the Board's attention to page 3 of the Goddard 
Declaration which clearly states that: 

It is further my considered scientific opinion that an at least 2-fold increase in 
gene copy number in a tumor tissue sample relative to a normal (i.e., non-tumor) 
sample is significant and useful in that the detected increase in gene copy number 
in the tumor sample relative to the normal sample serves as a basis for using 
relative gene copy number as quantitated by the TaqMan PCR technique as a 
diagnostic marker for the presence or absence of tumor in a tissue sample of 
unknown pathology. Accordingly, a gene identified as being amplified at least 2- 
fold by the quantitative TaqMan PCR assay in a tumor sample relative to a normal 
sample is useful as a marker for the diagnosis of cancer, for monitoring cancer 
development and/or for measuring the efficacy of cancer therapy. 
(Emphasis added). 

Appellants point out that the Declaration by Dr. Audrey Goddard provides a statement by 
an expert in the relevant art that "fold amplification" values of at least 2-fold are considered 
significant in the TaqMan™ PCR gene amplification assay. 

Thus, according to the Goddard Declaration, Appellants maintain that the 2.12-fold to 6-fold 
amplification disclosed for the PRO1788 gene is significant and forms the basis for the utility 
claimed herein. Any skilled artisan in the field of oncology would readily appreciate that this 
gene is a good candidate marker for diagnosing colon tumors and would clearly find utility for the 
PRO1788 gene as a diagnostic for colon cancer or for diagnosing individuals at risk for 
developing colon cancer. 

The Examiner has asserted that in order for PRO1788 polypeptides to be overexpressed 
in tumors, amplified genomic DNA would have to correlate with increased mRNA levels and 
increased polypeptide levels. The Examiner has further asserted that the specification does not 
provide data regarding PRO1788 mRNA or PRO1788 polypeptide levels in colon tumors. (Page 
6 of the Final Office Action mailed March 10, 2008). 

The Examiner's reference to the lack of necessary correlation or accurate prediction in some 
of the rejections clearly shows that the Examiner applies an improper legal standard when making 
this rejection. The evidentiary standard to be used throughout ex parte examination in setting 
forth a rejection is a preponderance of the totality of the evidence under consideration. Thus, to 
overcome the presumption of truth that an assertion of utility by the Applicant enjoys, the 
Examiner must establish that it is more likely than not that one of ordinary skill in the art would 
doubt the truth of the statement of utility. Only after the Examiner has made a proper prima 
facie showing of lack of utility, does the burden of rebuttal shift to the Applicant. As discussed 
below, the references cited by the Examiner do not suffice to make a prima facie case that more 
likely than not no generalized correlation exists between gene (DNA) amplification and 
increased polypeptide levels. 

In contrast, Appellants have submitted ample evidence to show that, in general, if a gene 
is amplified in cancer, it is more likely than not that the encoded protein will be expressed at an 
elevated level. First, the articles by Orntoft et al., Hyman et al., and Pollack et al. (made of 
record in Appellants' Response filed August 19, 2004) collectively teach that, in general, gene 
amplification increases mRNA expression. Second, as the Examiner has acknowledged, the art 
teaches that, in general, there is a correlation between mRNA levels and polypeptide levels. 

Accordingly, one of skill in the art would reasonably expect in this instance, based on the 
amplification data for the PRO1788 gene, that the PRO1788 polypeptide is concomitantly 
overexpressed. Thus, the claimed antibodies to the PRO1788 polypeptide have utility in the 
diagnosis of cancer. 

The Examiner has asserted that further research would have been required of the skilled 
artisan to reasonably confirm that PRO1788 is overexpressed in any cancer to the extent that it 
could be used as a cancer diagnostic agent; thus the asserted utility is not substantial. (Page 8 
of the Final Office Action mailed March 10, 2008). 

As discussed in previous responses of record, M.P.E.P. §2107.01 cautions Office 
personnel not to interpret the phrase "immediate benefit to the public" or similar formulations 
used in certain court decisions to mean that products or services based on the claimed invention 
must be "currently available" to the public in order to satisfy the utility requirement. "Rather, 
any reasonable use that an Applicant has identified for the invention that can be viewed as 
providing a public benefit should be accepted as sufficient, at least with regard to defining a 
'substantial' utility." 18 Indeed, the Guidelines for Examination of Applications for Compliance 
With the Utility Requirement, 19 gives the following instruction to patent examiners: "If the 
Applicant has asserted that the claimed invention is useful for any particular practical purpose 
. . . and the assertion would be considered credible by a person of ordinary skill in the art, do not 
impose a rejection based on lack of utility." 

18 M.P.E.P. §2107.01. 

19 M.P.E.P. §2107 II(B)(1). 
Appellants' position is based on the overwhelming evidence from gene amplification data 
disclosed in the specification which clearly indicate that the gene encoding PRO1788 is 
significantly amplified in certain lung and colon tumors. Based on the working hypothesis 
among those skilled in the art that if a gene is amplified in cancer, the encoded protein is likely to 
be expressed at an elevated level, one skilled in the art would simply accept that since the 
PRO1788 gene is amplified, the PRO1788 polypeptide would be more likely than not 
overexpressed. Thus, data relating to PRO1788 polypeptide expression may be used for the same 
diagnostic and prognostic purposes as data relating to PRO1788 gene expression. Therefore, 
based on the disclosure in the specification, no further research would be necessary to determine 
how to use the claimed PRO1788 polypeptides, because the current invention is fully enabled by 
the disclosure of the present application. 

Accordingly, Appellants submit that based on the general knowledge in the art at the time 
the invention was made and the teachings in the specification, the specification provides clear 
guidance as to how to interpret and use the data relating to PRO1788 polypeptide expression and 
that the claimed PRO1788 polypeptides have utility in the diagnosis of cancer. 

C. A prima facie case of lack of utility has not been established 

Appellants respectfully submit that the Examiner has not made a proper prima facie 
showing of lack of utility, because the Examiner has not shown that Appellants' asserted utility 
is more likely than not incorrect. 

The Examiner has asserted that "[t]he claimed functional use of DNA for detecting colon 
tumors is not equivalent to identifying a use for the claimed polypeptide." (Page 4 of the Office 
Action mailed March 10, 2008). The Examiner has further asserted that the "'Increase in gene 
copy number' (i.e., DNA data) is not equivalent to increased polypeptide levels." (Page 4 of the 
Office Action mailed March 10, 2008). In support of this assertion, the Examiner refers to the 
references of record by Haynes, Hu, Chen, Pennica and Konopka. 

As a preliminary matter, Appellants respectfully submit that it is not a legal requirement 
to establish that gene amplification "necessarily" results in increased expression at the mRNA 
and polypeptide levels or that polypeptide levels can be "accurately predicted." As discussed 
above, the evidentiary standard to be used throughout ex parte examination of a patent 
application is a preponderance of the totality of the evidence under consideration. Accordingly, 
Appellants submit that in order to overcome the presumption of truth that an assertion of utility 
by the applicant enjoys, the Examiner must establish that it is more likely than not that one of 
ordinary skill in the art would doubt the truth of the statement of utility. Therefore, it is not 
legally required that there be a "necessary" correlation between the data presented and the 
claimed subject matter. The law requires only that one skilled in the art should accept that such a 
correlation is more likely than not to exist. Appellants respectfully submit that when the proper 
evidentiary standard is applied, a correlation must be acknowledged. 

Pennica et al. 

Appellants submit that Pennica et al. does not show a lack of correlation between gene 
(DNA) amplification and mRNA levels. According to the quoted statement from Pennica et al., 
"WISP-1 gene amplification in human lung tumors showed a correlation between DNA 
amplification and over-expression, whereas overexpression of WISP-3 RNA was seen in the 
absence of DNA amplification. In contrast, WISP-2 DNA was amplified in lung tumors, but its 
mRNA expression was significantly reduced in the majority of tumors compared with expression 
in normal lungic mucosa from the same patient." From this, the Examiner correctly concludes 
that increased copy number does not necessarily result in increased polypeptide expression. The 
standard, however, is not absolute certainty. The fact that in the case of a specific class of 
closely related molecules there seemed to be no correlation between gene amplification and the 
level of mRNA/protein expression does not establish that it is more likely than not, in general, 
that such correlation does not exist. The Examiner has not shown whether the lack of correlation 
observed for the family of WISP polypeptides is typical, or is merely a discrepancy, an exception 
to the rule of correlation. Indeed, the working hypothesis among those skilled in the art is that, if 
a gene is amplified in cancer, the encoded protein is likely to be expressed at an elevated level. 
In fact, as noted even in Pennica et al., "[a]n analysis of WISP-1 gene amplification and 
expression in human lung tumors showed a correlation between DNA amplification and 
over-expression" (Pennica et al., page 14722, left column, first full paragraph, emphasis added). 

Accordingly, Appellants respectfully submit that Pennica et al. teaches nothing 
conclusive regarding the absence of correlation between amplification of a gene and 
over-expression of the encoded WISP polypeptide. More importantly, the teaching of Pennica 
et al. is specific to WISP genes. Pennica et al. has no teaching whatsoever about the correlation 
of gene amplification and protein expression in general. 

Konopka et al. 

Regarding Konopka et al., Appellants submit that the Examiner has completely 
misinterpreted the teachings in the cited reference. Contrary to the Examiner's assertions, 
Konopka et al. does not support the position that DNA amplification is not correlated with 
mRNA overexpression. Konopka et al. show only that, of the cell lines known to have increased 
abl protein expression, only one had amplification of the abl gene (page 4051, col. 1). This 
result proves only that increased mRNA and protein expression levels can result from causes 
other than gene amplification. Konopka et al. do not demonstrate that when gene amplification 
does occur, it does not result in increased mRNA and protein expression levels, particularly 
given that the cell line with amplification of the abl gene did show increased abl mRNA and 
protein expression levels. Furthermore, Konopka et al. supports Appellants' position that mRNA 
levels correlate with protein levels. Konopka et al. state that "the 8-kb mRNA that encodes 
P210c-abl was detected at a 10-fold higher level in SK-CML7bt-333 (Fig. 3A, +) than in 
SK-CML16BH (B, +), which correlated with the relative level of P210c-abl detected in each cell 
line. Analysis of additional cell lines demonstrated that the level of 8-kb mRNA directly 
correlated with the level of P210c-abl (Table 1)" (page 4050, col. 2, emphasis added). 

Haynes et al. 

The Examiner has cited Haynes et al. as allegedly providing evidence that "polypeptide 
levels cannot be accurately predicted from mRNA levels, and that variances as much as 40-fold 
or even 50-fold were not uncommon." (Page 6 of the Office Action mailed May 10, 2005). The 
law does not require the existence of a strong or linear correlation between mRNA and protein 
levels. Nor does the law require that protein levels be "accurately" predicted. According to the 
authors themselves, the Haynes data confirm that there is a "general trend" between protein 
expression and transcript levels (page 1863, col. 1), which meets the "more likely than not" 
standard and shows that a positive correlation exists between mRNA and protein. For example, 
in Figure 1, there is a positive correlation between mRNA and protein levels amongst most of the 
80 yeast proteins studied. In fact, very few data points deviated or scattered away from the 
expected normal and no data points showed a negative correlation between mRNA and protein 
levels (i.e., an increase in mRNA resulted in a decrease in protein levels). The analysis by 
Haynes et al. is not relevant to the current application. Haynes et al. studied yeast cells and not 
human cells. Haynes et al. note that their analysis focused on the 80 most abundant proteins in 
the yeast lysate (page 1867). Haynes et al. state that "since many important regulatory protein 
are present only at low abundance, these would not be amenable to analysis" (page 1867). 
Further, Haynes et al. compared the protein expression levels of these naturally abundant 
proteins to mRNA expression levels from published SAGE frequency tables (page 1863). 
Accordingly, Haynes et al. did not compare mRNA expression levels and protein levels in the 
same yeast cells. Thus the analysis by Haynes et al. is not applicable to the present application. 

Hu et al. 

The Examiner has further cited Hu et al. to the effect that genes displaying a 5-fold 
change or less in mRNA expression in tumors compared to normal showed no evidence of a 
correlation between altered gene expression and a known role in the disease. However, among 
genes with a 10-fold or more change in expression level, there was a strong and significant 
correlation between expression level and a published role in the disease. (Page 5 of the Office 
Action mailed May 10, 2005). 

Appellants submit that in order to overcome the presumption of truth that an assertion of 
utility by the applicant enjoys, the Examiner must establish that it is more likely than not that one 
of ordinary skill in the art would doubt the truth of the statement of utility. Accordingly, 
contrary to the Examiner's assertion, Appellants submit that Hu et al. does not conclusively 
show that it is more likely than not that gene amplification does not result in increased 
expression at the mRNA and polypeptide levels. First, the title of Hu et al. is "Analysis of 
Genomic and Proteomic Data Using Advanced Literature Mining." As the title clearly suggests, 
the conclusion suggested by Hu et al. is merely based on a statistical analysis of the information 
disclosed in the published literature. As Hu et al. states, "We have utilized a computational 
approach to literature mining to produce a comprehensive set of gene-disease relationships." In 
particular, Hu et al. relied on the MedGene Database and the Medical Subject Heading (MeSH) 
files to analyze the gene-disease relationship. More specifically, Hu et al. "compared the 
MedGene breast cancer gene list to a gene expression data set generated from a micro-array 
analysis comparing breast cancer and normal breast tissue samples." (See page 408, right 
column). 

Therefore, Appellants first submit that the reference by Hu et al. only studies the 
statistical analysis of micro-array data and not gene amplification data. Therefore, their findings 
would not be directly applicable to gene amplification data. In addition, Appellants respectfully 
submit that the Hu et al. reference does not show that a lack of correlation between microarray 
data and the biological significance of cancer genes is typical. 

According to Hu et al., "different statistical methods" were applied to "estimate the 
strength of gene-disease relationships and evaluated the results." (See page 406, left column, 
emphasis added). Using these different statistical methods, Hu et al. "[a]ssessed the relative 
strengths of gene-disease relationships based on the frequency of both co-citation and single 
citation." (See page 411, left column). It is well known in the art that various statistical methods 
allow different variables to be manipulated to affect the outcome. For example, the authors 
admit, "Initial attempts to search the literature using" the list of genes, gene names, gene 
symbols, and frequently used synonyms, generated by the authors "revealed several sources of 
false positives and false negatives." (See page 406, right column). The authors further admit 
that the false positives caused by "duplicative and unrelated meanings for the term" were 
"difficult to manage." Therefore, in order to minimize such false positives, Hu et al. disclose 
that these terms "had to be eliminated entirely, thereby reducing the false positive rate but 
unavoidably under-representing some genes." Id. Hence, Appellants respectfully submit that in 
order to minimize the false positives and negatives in their analysis, Hu et al. manipulated 
various aspects of the input data. 

Appellants further submit that the statistical analysis by Hu et al. is not a reliable standard 
because the frequency of citation reflects only the current research interest of a molecule rather 
than the true biological function of the molecule. Indeed, the authors acknowledge that 
"[r]elationship established by frequency of co-citation do not necessarily represent a true 
biological link." (See page 411, right column). It often happens in scientific study that 
important molecules are overlooked by the scientific society for many years until the discovery 
of their true function. Therefore, Appellants submit that Hu et al. drew their conclusion based on 
a very unreliable standard and that their research does not provide any meaningful information 
regarding the correlation between microarray data and the biological significance of a molecule. 

Even assuming that Hu et al. provide evidence to support a true relationship, the 
conclusion in Hu et al. only applies to a specific type of breast tumor (estrogen receptor 
(ER)-positive breast tumor) and cannot be generalized as a principle governing microarray study of 
breast cancer in general, let alone the various other types of cancer genes in general. In fact, 
even Hu et al. admit that "[i]t is likely that this threshold will change depending on the disease as 
well as the experiment. Interestingly, the observed correlation was only found among ER- 
positive (breast) tumors not ER-negative tumors." (See page 412, left column). Therefore, 
based on these findings, the authors add, "This may reflect a bias in the literature to study the 
more prevalent type of tumor in the population. Furthermore, this emphasizes that caution must 
be taken when interpreting experiments that may contain subpopulations that behave very 
differently." Id. (Emphasis added). 

In summary, Appellants respectfully submit that the Examiner has not shown that a lack 
of correlation between microarray data and the biological significance of cancer genes, as 
observed for ER-positive breast tumor, is typical. Since the standard is not absolute certainty, a 
prima facie showing of lack of utility has not been made in this instance. 

Chen et al. 

The Examiner has cited Chen et al. to show that an increase in mRNA level does not 
correlate with an increase in protein level (Page 5 of the Office Action mailed March 10, 2008). 

First, Appellants note that proteins selected for study by Chen et al. were those 
detectable by staining of 2D gels. As noted in, for example, Haynes et al., there are problems 
with selecting proteins detectable by 2D gels. "It is apparent that without prior enrichment only 
a relatively small and highly selected population of long-lived, highly expressed proteins is 
observed. There are many more proteins in a given cell which are not visualized by such 
methods. Frequently, it is the low abundance proteins that execute key regulatory functions" 
(Haynes, p. 1870, col. 1). Thus, Chen et al., by selecting proteins detectable by staining of 2D 
gels, are likely to have excluded from their analysis many of the proteins most likely to be 
significant as cancer markers. 

Secondly, Chen et al. looked at expression levels across a set of samples including a large 
number of tumor samples (76) along with a much smaller number of normal samples (9). The 
tumor samples were taken from stage I and stage III lung adenocarcinomas, which were 
classified as bronchoalveolar, bronchial derived or both bronchial and bronchoalveolar derived. 
Accordingly, the tissues examined were from different tissues in different stages of normal or 
cancerous growth. The authors determined the relationship between mRNA and protein 
expression by using the average expression values for all samples. The average value for each 
protein or mRNA was generated using all 85 lung tissue samples. This resulted in negative 
normalized protein values in some cases. Further, the authors chose an arbitrary threshold of 
0.115 for the correlation to be considered significant. Accordingly, the Chen paper does not 
account for different expression in different tissues or different stages of cancer. 

Thirdly, no attempt was made to compare expression levels in normal versus tumor 
samples, and in fact the authors concede that they had too few normal samples for meaningful 
analysis (Chen, p. 310, col. 2). As a result, the analysis in the Chen paper shows only that a 
number of randomly selected proteins have varying degrees of correlation between mRNA and 
protein expression levels within a set of different lung adenocarcinoma samples. The Chen 
paper does not address the issue of whether increased mRNA levels in the tumor samples taken 
together as one group, as compared to the normal samples as a group, correlated with increased 
protein levels in tumor tissue versus normal tissue. Accordingly, the results presented in the 
Chen paper are not applicable to the present application. 

The correct test of utility is whether the utility is "more likely than not." In the case of 
the Chen reference, even if the analysis presented is correct (which is disputed), a review of the 
correlation coefficient data presented in the Chen et al. paper indicates that it is more likely than 
not that increased mRNA expression correlates with increased protein expression. A review of 
Table I, which lists 66 genes [the paper incorrectly states there are 69 genes listed] for which 
only one protein isoform is expressed, shows that 40 genes out of 66 had a positive correlation 
between mRNA expression and protein expression. This clearly meets the test of "more likely 
than not." Similarly, in Table II, 30 genes with multiple isoforms [again the paper incorrectly 
states there are 29] were presented. In this case, at least 22 genes had one isoform showing a 
positive correlation between mRNA expression and protein expression. Furthermore, 12 genes 
out of 29 showed a significant positive correlation [as determined by the authors] for at least one 
isoform. No genes showed a significant negative correlation. It is not surprising that not all 
isoforms for each gene positively correlate with mRNA expression. As the Examiner may be 
aware, some isoforms are likely non-functional proteins. Thus, Table II further supports 
Appellants' assertion that it is more likely than not that protein levels correlate with mRNA 
expression levels. 

Lewin 

The Examiner has referred to Lewin as teaching that "control of gene expression can 
occur at multiple stages, and that production of mRNA cannot inevitably be equated with 
production of protein." (Page 6 of the Office Action mailed March 10, 2008). 

Appellants respectfully submit that the utility standard is not absolute certainty. Rather, 
to overcome the presumption of truth that an assertion of utility by an applicant enjoys, the PTO 
must establish that it is more likely than not that one of ordinary skill in the art would doubt the 
truth of the statement of utility. Therefore, Appellants do not need to establish that transcription 
initiation is the only means of regulating gene expression in order to meet the utility standard. 
Instead, as long as it is the most common point of regulation, as admitted by the Examiner, it 
would be more likely than not that a change in the transcription level of a gene gives rise to a 
change in translation level of a gene. Appellants note that Lewin makes clear that it is far more 
likely than not that protein levels for any given gene are regulated at the transcriptional level. In 
particular, Lewin states that "having acknowledged that control of gene expression can occur at 
multiple stages, and that production of RNA cannot inevitably be equated with production of 
protein, it is clear that the overwhelming majority of regulatory events occur at the initiation of 
transcription." Genes VI at 847-848 (Emphasis added). Thus, the utility standard is met. 

Futcher et al. 

The Examiner has referred to the reference of record by Futcher et al. as stating that 
"Gygi et al. feel that mRNA abundance is a poor predictor of protein abundance." (Page 7 of 
the Office Action mailed March 10, 2008). 

Appellants respectfully point out that Futcher et al. refer to Gygi et al. in the process of 
explaining in detail why their results did show a correlation between mRNA and protein levels 
even for low abundance proteins, while the previous study by Gygi et al. did not. In fact, 
Futcher et al. concluded that "several statistical methods show a strong and significant 
correlation between mRNA abundance and protein abundance." (Page 7360, col. 2; 
Emphasis added). 

The authors note that Gygi et al. completed a similar study that generated broadly similar 
data, but reached different conclusions. Futcher et al. point out that "the different conclusions 
are also partly due to different methods of statistical analysis, and to real differences in data." 
Futcher et al. note that Gygi et al. used the Pearson product-moment correlation coefficient (r_p) 
and point out that "a calculation of r_p is inappropriate" because the mRNA and protein 
abundances are not normally distributed. (Page 7367, col. 1). In contrast, Futcher et al. used 
two different statistical approaches to determining the correlation between mRNA and protein 
abundances. First, they used the Spearman rank correlation coefficient (r_s), a nonparametric 
statistic that does not require the data to be normally distributed. Using the r_s, the authors found 
that mRNA abundance was well correlated with protein abundance (r_s = 0.74). Applying this 
statistical approach to the data of Gygi et al. also resulted in a good correlation (r_s = 0.59), 
although the correlation was not quite as strong as for the Futcher et al. data. In a second 
approach, Futcher et al. transformed the mRNA and protein data to forms where they were 
normally distributed, in order to allow calculation of an r_p. Two types of transformation 
(Box-Cox and logarithmic) were used, and both resulted in good correlations between mRNA and 
protein abundance for Futcher et al.'s data. 
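
The difference between the two statistics discussed above can be illustrated with a minimal sketch. The abundance values below are hypothetical, chosen only because they are strongly skewed; they are not data from Futcher et al. or Gygi et al., and the function names are illustrative assumptions rather than anything used by either group.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical, made-up abundances (one extreme mRNA value skews the data).
mrna = [1, 2, 3, 5, 8, 13, 21, 400]
protein = [10, 30, 20, 60, 50, 90, 80, 100]

r_p_raw = pearson(mrna, protein)
r_s = spearman(mrna, protein)
# Pearson again, after a logarithmic transformation of both variables.
r_p_log = pearson([math.log(v) for v in mrna], [math.log(v) for v in protein])

print(f"r_p (raw) = {r_p_raw:.2f}")
print(f"r_s       = {r_s:.2f}")
print(f"r_p (log) = {r_p_log:.2f}")
```

On these made-up values the single extreme mRNA measurement pulls the raw Pearson coefficient well below the rank-based coefficient, even though protein abundance rises almost monotonically with mRNA abundance. The sketch shows only how the choice of statistic can change the reported strength of a correlation on skewed data; it says nothing about the actual data sets at issue.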

Futcher et al. also note that the two studies used different methods of measuring protein 
abundance. Gygi et al. cut spots out of each gel and measured the radiation in each spot by 
scintillation counting, whereas Futcher et al. used phosphorimaging of intact gels coupled to 
image analysis. Futcher et al. point out that Gygi et al. may have systematically overestimated 
the amount of the lowest-abundance proteins, because of the difficulty in accurately cutting out 
very small spots from the gel, and because of difficulties in background subtraction for small, 
weak spots. 

In addition, Futcher et al. note that they used both SAGE data and RNA hybridization 
data to determine mRNA abundances, which is most helpful to accurately measure the least 
abundant mRNAs. As a result, while the Futcher data set "maintains a good correlation between 
mRNA and protein abundance even at low protein abundance" (page 7367, col. 2), the Gygi data 
shows a strong correlation for the most abundant proteins, but a poor correlation for the least 
abundant proteins in their data set. Futcher et al. conclude that "the poor correlation of protein 
to mRNA for the nonabundant proteins of Gygi et al. may reflect difficulty in accurately 
measuring these nonabundant proteins and mRNAs, rather than indicating a truly poor 
correlation in vivo." (Page 7367, col. 2; Emphasis added). Thus, while these lowest abundant 
proteins do show a poor correlation, this is almost certainly due to the less accurate methods used 
to measure the abundance of these proteins, and not to any actual lack of correlation. 

Appellants further note that, as Futcher et al. was published later than Gygi et al., 
Futcher's conclusions should be considered as the updated view in the art, which supports the 
existence of a correlation between mRNA and protein levels. 

In summary, the Patent Office has failed to meet its initial burden of proof that 
Appellants' claims of utility are not substantial or credible. The arguments presented by the 
Examiner in combination with the cited articles do not provide sufficient reasons to doubt the 
statements by Appellants that PRO1788 has utility. As discussed above, the law does not require 
that DNA amplification is "always" associated with overexpression of the gene product. 
Therefore, Appellants submit that the Examiner's reasoning is based on a misrepresentation of 
the scientific data presented in the above cited reference and application of an improper, 
heightened legal standard. In fact, contrary to what the Examiner contends, the art indicates that, 
if a gene is amplified in cancer, it is more likely than not that the encoded protein will be 
expressed at an elevated level. 

It is "more likely than not" for amplified genes to have increased mRNA 

On the contrary, Appellants submit that Example 143 of the specification further 
discloses that, "[a]mplification is associated with overexpression of the gene product, indicating 
that the polypeptides are useful targets for therapeutic intervention in certain cancers such as 
lung, colon, breast and other cancers and diagnostic determination of the presence of those 
cancers" (Emphasis added). Besides, Appellants have submitted ample evidence to show that, in 
general, if a gene is amplified in cancer, it is "more likely than not" that the corresponding 
mRNA will also be expressed at an elevated level. 

For instance, Appellants presented the articles by Orntoft et al., Hyman et al., and 
Pollack et al. (made of record in Appellants' Response filed January 18, 2005), who collectively 
teach that in general, for most genes, DNA amplification increases mRNA expression. The 
results presented by Orntoft et al., Hyman et al., and Pollack et al. are based upon wide ranging 
analyses of a large number of tumor associated genes. Orntoft et al. studied transcript levels of 
5600 genes in malignant bladder cancers, many of which were linked to the gain or loss of 
chromosomal material, and found that in general (18 of 23 cases) chromosomal areas with more 
than 2-fold gain of DNA showed a corresponding increase in mRNA transcripts. Hyman et al. 
compared DNA copy numbers and mRNA expression of over 12,000 genes in breast cancer 
tumors and cell lines, and found that there was evidence of a prominent global influence of copy 
number changes on gene expression levels. In Pollack et al., the authors profiled DNA copy 
number alteration across 6,691 mapped human genes in 44 predominantly advanced primary 
breast tumors and 10 breast cancer cell lines, and found that on average, a 2-fold change in DNA 
copy number was associated with a corresponding 1.5-fold change in mRNA levels. In 
summary, the evidence supports the Appellants' position that gene amplification is more likely 
than not predictive of increased mRNA and polypeptide levels. 

Second, Appellants have submitted over a hundred references, along with the 
Declarations of Dr. Paul Polakis and Dr. Randy Scott with their Responses filed on 
January 18, 2005 and August 11, 2006, which collectively teach that, in general, there is a 
correlation between mRNA levels and polypeptide levels. 

In their Response filed January 18, 2005, Appellants submitted a Declaration by Dr. 
Polakis, principal investigator of the Tumor Antigen Project of Genentech, Inc., the assignee of 
the present application, to show that mRNA expression correlates well with protein levels, in 
general. As Dr. Polakis explains, the primary focus of the microarray project was to identify 
tumor cell markers useful as targets for both the diagnosis and treatment of cancer in humans. 
The scientists working on the project extensively rely on results of microarray experiments in 
their effort to identify such markers. As Dr. Polakis explains, using microarray analysis, 
Genentech scientists have identified approximately 200 gene transcripts (mRNAs) that are 
present in human tumor cells at significantly higher levels than in corresponding normal human 
cells. To the date of the Declaration, they have generated antibodies that bind to about 30 of the 
tumor antigen proteins expressed from these differentially expressed gene transcripts and have 
used these antibodies to quantitatively determine the level of production of these tumor antigen 
proteins in both human cancer cells and corresponding normal cells. Having compared the levels 
of mRNA and protein in both the tumor and normal cells analyzed, they found a very good 
correlation between mRNA and corresponding protein levels. Specifically, in approximately 
80% of their observations they have found that increases in the level of a particular mRNA 
correlate with changes in the level of protein expressed from that mRNA. While the proper 
legal standard is to show that the existence of correlation between mRNA and polypeptide levels 
is more likely than not, the showing of approximately 80% correlation for the molecules tested 
according to the Polakis Declaration greatly exceeds this legal standard. Based on these 
experimental data and his vast scientific experience of more than 20 years, Dr. Polakis states 
that, for human genes, increased mRNA levels typically correlate with an increase in abundance 
of the encoded protein. He further confirms that "it remains a central dogma in molecular 
biology that increased mRNA levels are predictive of corresponding increased levels of the 
encoded protein." Appellants respectfully point out that the Declaration by Dr. Polakis (Polakis 
II) presents evidentiary data in Exhibit B. Exhibit B of the Declaration identifies 28 gene 
transcripts out of 31 gene transcripts (i.e., greater than 90%) that showed good correlation 
between tumor mRNA and tumor protein levels. As Dr. Polakis' Declaration (Polakis II) says, 
"[a]s such, in the cases where we have been able to quantitatively measure both (i) mRNA and 
(ii) protein levels in both (i) tumor tissue and (ii) normal tissue, we have observed that in the vast 
majority of cases, there is a very strong correlation between increases in mRNA expression and 
increases in the level of protein encoded by that mRNA." 

Appellants have also submitted, with their Response filed on August 11, 2006, a 
Declaration by Dr. Randy Scott ("the Scott Declaration"). Dr. Scott was a co-founder of Incyte 
Pharmaceuticals, Inc., the world's first genomic information business, and is currently the 
Chairman and Chief Executive Officer of Genomic Health, Inc., a life sciences company located 
in Redwood City, California, which provides individualized information on the likelihood of 
disease recurrence and response to certain types of therapy using gene expression profiling. 
Based on his more than 15 years of personal experience with the DNA microarray technique and 
its various uses in the diagnostic and therapeutic fields, and his familiarity with the relevant art, 
Dr. Scott unequivocally confirms that, as a general rule, there is a good correlation between 
mRNA and protein levels in a particular tissue. 

As stated in paragraph 8 of the Scott Declaration: 

DNA microarray analysis has been extensively used in drug development and in 
diagnosis of various diseases. Due to its importance in drug discovery and in 
the field of diagnostics, microarray technology has not only become a laboratory 
mainstay but also created a world-wide market of over $600 million in the year of 
2005. A long line of companies, including Incyte, Affymetrix, Agilent, Applied 
Biosystems, and Amersham Biosciences, made microarray technology a core of 
their business. 

In paragraph 10 of his Declaration, Dr. Scott explains the reasons for the wide-spread use 
and impressive commercial success of this technique, stating: 

One reason for the success and wide-spread use of the DNA microarray 
technique, which has led to the emergence of a new industry, is that generally 
there is a good correlation between mRNA levels determined by microarray 
analysis and expression levels of the translated protein. Although there are some 
exceptions on an individual gene basis, it has been a consensus in the scientific 
community that elevated mRNA levels are good predictors of increased 
abundance of the corresponding translated proteins in a particular tissue. 
Therefore, diagnostic markers and drug candidates can be readily and efficiently 
screened and identified using this technique, without the need to directly measure 
individual protein expression levels. (Emphasis added). 

The Declaration, which is based on Dr. Scott's unparalleled experience with both the 
microarray technique and its industrial and clinical applications, supports Appellants' position 
that the microarray technology is not only mature, reliable and well-accepted in the art, but also 
has been extensively used in drug development and in diagnosis of various diseases and 
produced enormous commercial success. Therefore, if a gene, such as the gene encoding the 
PRO1788 polypeptide, has been identified to be over-expressed in a certain disease, such as 
colon cancer, it is more likely than not that the protein product is also overexpressed in the 
disease. 

Thus, taken together, all of the submitted evidence supports Appellants' position that 
gene amplification is more likely than not predictive of increased mRNA and polypeptide levels. 

Thus, the Examiner appears to disregard the ample evidence provided in the above 
referenced articles based on misinterpretations of their teachings. Appellants submit that in fact, 
these articles lend significant support that for an amplified gene, it is more likely than not that the 
protein will also be overexpressed and would be viewed as reasonable and credible by one of 
ordinary skill in the art. The "more likely than not" standard is a much lower standard than a 
"necessary" correlation or "accurate" prediction, and is clearly met in the claimed invention. 
Moreover, the Examiner has not cited any evidence or advanced any arguments as to why 
Appellants' statement of overexpression of protein would not be credible. Accordingly, this 
point is believed to be moot. 

The Examiner has further asserted that the Hyman reference teaches "[l]ess than half 
(44%) of highly amplified genes showed mRNA overexpression (abstract)." (Page 6 of the Final 
Office Action mailed March 10, 2008). 

Appellants submit the Examiner's assertion is not consistent with the interpretation 
Hyman et al. themselves place on their data, stating that, "The results illustrate a considerable 
influence of copy number on gene expression patterns." (page 6242, col. 1; emphasis added). 
In the more detailed discussion of their results, Hyman et al. teach that "[u]p to 44% of the 
highly amplified transcripts (CGH ratio, >2.5) were overexpressed (i.e., belonged to the global 
upper 7% of expression ratios) compared with only 6% for genes with normal copy number." 
(See page 6242, col. 1; emphasis added). These details make it clear that Hyman et al. set a 
highly restrictive standard for considering a gene to be overexpressed; yet almost half of all 
highly amplified transcripts met even this highly restrictive standard. Therefore, the analysis 
performed by Hyman et al. clearly shows that "it is more likely than not" that a gene which is 
amplified in tumor cells will have increased gene expression. 

As stated above, the Orntoft et al., Hyman et al., and Pollack et al. articles were 
submitted to support the correlation between gene amplification and mRNA levels, which 
according to the Examiner is the sole basis of the maintained rejections. With regard to the 
correlation between mRNA expression and protein levels, Appellants previously submitted a 
Declaration by Dr. Polakis, principal investigator of the Tumor Antigen Project of Genentech, 
Inc., the assignee of the present application, along with over 100 supporting references (made of 
record in the Preliminary Amendment of March 9, 2007), to show that mRNA expression 
correlates well with protein levels, in general. 

Even if a prima facie case of lack of utility has been established, it should be
withdrawn on consideration of the totality of evidence

Even if one assumes arguendo that it is more likely than not that there is no correlation
between gene amplification and increased mRNA/protein expression, which Appellants submit is
not true, a polypeptide encoded by a gene that is amplified in cancer would still have a specific,
substantial, and credible utility. In support, Appellants respectfully draw the Board's attention to
page 2 of the Declaration of Dr. Avi Ashkenazi (submitted with the Response filed January 18,
2005) which explains that,

even when amplification of a cancer marker gene does not result in significant 
over-expression of the corresponding gene product, this very absence of gene 
product over-expression still provides significant information for cancer diagnosis, 
and treatment. Thus, if over-expression of the gene product does not parallel gene 
amplification in certain tumor types but does so in others, then parallel monitoring 
of gene amplification and gene product over-expression enables more accurate 
tumor classification and hence better determination of suitable therapy. In 
addition, absence of over-expression is crucial information for the practicing 
clinician. If a gene is amplified but the corresponding gene product is not over- 
expressed, the clinician accordingly will decide not to treat a patient with agents 
that target that gene product. 

Appellants thus submit that simultaneous testing of gene amplification and gene product 
over-expression enables more accurate tumor classification, even if the gene-product, the protein, 
is not over-expressed. This leads to better determination of a suitable therapy. Further, as 
explained in Dr. Ashkenazi's Declaration, absence of over-expression of the protein itself is 
crucial information for the practicing clinician. If a gene is amplified in a tumor, but the 
corresponding gene product is not over-expressed, the clinician will decide not to treat a patient 
with agents that target that gene product. This not only saves money, but also has the benefit that 
the patient can avoid exposure to the side effects associated with such agents. 

This utility is further supported by the teachings of the article by Hanna and Mornin
(Pathology Associates Medical Laboratories, August (1999); submitted with the Response filed
January 18, 2005). The article teaches that the HER-2/neu gene has been shown to be amplified
and/or over-expressed in 10%-30% of invasive breast cancers and in 40%-60% of intraductal 
breast carcinomas. Further, the article teaches that diagnosis of breast cancer includes testing 
both the amplification of the HER-2/neu gene (by FISH) as well as the over-expression of the 
HER-2/neu gene product (by IHC). Even when the protein is not over-expressed, the assay 
relying on both tests leads to a more accurate classification of the cancer and a more effective 
treatment of it. 
Hanna et al. clearly state that gene amplification (as measured by FISH) and polypeptide
expression (as measured by immunohistochemistry, IHC) are well correlated ("in general, FISH
and IHC results correlate well" (Hanna et al. p. 1, col. 2)). It is only a subset of tumors which
show discordant results. Thus Hanna et al. support Appellants' position that it is more likely 
than not that gene amplification correlates with increased polypeptide expression. 

Appellants have clearly shown that the gene encoding the PRO1788 polypeptide is
amplified in at least eight colon tumors. Therefore, the PRO1788 gene, similar to the HER-2/neu
gene disclosed in Hanna et al., is a tumor associated gene. Furthermore, as discussed above, in
the majority of amplified genes, the teachings in the art overwhelmingly show that gene
amplification influences gene expression at the mRNA and protein levels. Therefore, one of skill
in the art would reasonably expect in this instance, based on the amplification data for the
PRO1788 gene, that the PRO1788 polypeptide is concomitantly overexpressed.

However, even if gene amplification does not result in overexpression of the gene
product (i.e., the protein), an analysis of the expression of the protein is useful in determining the
course of treatment, as supported by the Ashkenazi Declaration and the Hanna paper. The
Examiner asserts that "there is no evidence as to whether the gene products (such as the
polypeptide) are over-expressed or not." (Page 4 of the instant Office Action). The Examiner
appears to view the testing described in the Ashkenazi Declaration and the Hanna paper as
experiments involving further characterization of the PRO1788 polypeptide itself. In fact, such
testing is for the purpose of characterizing not the PRO1788 polypeptide, but the tumors in
which the gene encoding PRO1788 is amplified. The PRO1788 polypeptides are therefore
useful in tumor categorization, the results of which become an important tool in the hands of a
physician enabling the selection of a treatment modality that holds the most promise for the
successful treatment of a patient.

For the reasons given above, Appellants respectfully submit that the present specification 
clearly describes, details and provides a patentable utility for the claimed invention. 
Accordingly, Appellants respectfully request reconsideration and reversal of the rejections of 
Claims 28-35 and 38-40 under 35 U.S.C. §101. 

Thus, based on the asserted utility for PRO1788 in the diagnosis of selected lung and
colon tumors, the reduction to practice of the instantly claimed protein sequence of SEQ ID
NO:77 in the present application, the disclosure of the step-by-step protocols for making
chimeric PRO polypeptides, including those wherein the heterologous polypeptide is an epitope
tag or an Fc region of an immunoglobulin in the specification, the disclosure of a step-by-step
protocol for making and expressing PRO1788 in appropriate host cells, the step-by-step protocol
for the preparation, isolation and detection of monoclonal, polyclonal and other types of
antibodies against the PRO1788 protein in the specification (pages 372-380 and Examples 132-133)
and the disclosure of the gene amplification assay in Example 143, the skilled artisan would
know exactly how to make and use the claimed polypeptide and its antibodies for the diagnosis
of lung and colon cancers. Appellants submit that based on the detailed information presented in
the specification and the advanced state of the art in oncology, the skilled artisan would have
found such testing routine and not 'undue'.

Therefore, Appellants respectfully request reconsideration and reversal of the
outstanding rejections under 35 U.S.C. §101 and §112, First Paragraph to Claims 28-35 and
38-40.

ISSUE III: Claims 28-33 and 39-40 satisfy the written description requirement of 35
U.S.C. §112, First Paragraph

Claims 28-33 and 39-40 stand rejected under 35 U.S.C. §112, first paragraph as allegedly
lacking adequate written description. In particular, the Examiner asserts that "a recitation related
to DNA does not reasonably constitute a 'functional limitation' for the claimed polypeptides."
The Examiner further asserts that Appellants have not described "a representative number of
species that have 80-99% homology to SEQ ID NO:397, such that it is clear that they were in
possession of a genus of polypeptides functionally similar to SEQ ID NO:397." (Pages 10-11 of
the Office Action mailed May 10, 2005).

Claim 33 

Appellants respectfully submit that Claim 33, directed to the full-length polypeptide of
SEQ ID NO:397, with or without its signal peptide, meets the written description requirement
under 35 U.S.C. §112, first paragraph. The Examiner has acknowledged that isolated
polypeptides comprising the sequence set forth in SEQ ID NO:397 meet the written description
provision of 35 U.S.C. §112, first paragraph. (Page 7 of the Office Action mailed September 16,
2004). Figure 232 of the specification discloses that the signal peptide of SEQ ID NO:397
comprises amino acid residues 1-16. Thus the specification has also described the amino acid
sequence of the polypeptide of SEQ ID NO:397 lacking its associated signal peptide.
Accordingly, the written description rejection does not apply to Claim 33 which does not recite
any variants of SEQ ID NO:397.

Claims 28-32 and 39-40 

Coupled with the general knowledge available in the art at the time of the invention, 
Appellants submit that the specification provides ample written support for the claimed 
polypeptides. Thus, based on the high percentage of sequence identity, one skilled in the art 
would have known at the time of the invention that the Appellants had possession of the claimed 
polypeptides. 

A. The Legal Test for Written Description 

The well-established test for sufficiency of support under the written description
requirement of 35 U.S.C. §112, first paragraph is "whether the disclosure of the application as
originally filed reasonably conveys to the artisan that the inventor had possession at that time of
the later claimed subject matter, rather than the presence or absence of literal support in the
specification for the claim language."20, 21 The adequacy of written description support is a
factual issue and is to be determined on a case-by-case basis.22 The factual determination in a
written description analysis depends on the nature of the invention and the amount of knowledge
imparted to those skilled in the art by the disclosure.23, 24

In Environmental Designs, Ltd. v. Union Oil Co.,25 the Federal Circuit held, "Factors
that may be considered in determining level of ordinary skill in the art include (1) the educational

20 In re Kaslow, 707 F.2d 1366, 1374, 217 U.S.P.Q. 1089, 1096 (Fed. Cir. 1983).

21 See also Vas-Cath, Inc. v. Mahurkar, 935 F.2d at 1563, 19 U.S.P.Q.2d at 1116 (Fed. Cir. 1991).

22 See e.g., Vas-Cath, 935 F.2d at 1563; 19 U.S.P.Q.2d at 1116.

23 Union Oil v. Atlantic Richfield Co., 208 F.3d 989, 996 (Fed. Cir. 2000).

24 See also M.P.E.P. §2163 II(A).

25 713 F.2d 693, 696, 218 U.S.P.Q. 865, 868 (Fed. Cir. 1983), cert. denied, 464 U.S. 1043 (1984).
level of the inventor; (2) type of problems encountered in the art; (3) prior art solutions to those
problems; (4) rapidity with which innovations are made; (5) sophistication of the technology;
and (6) educational level of active workers in the field." (Emphasis added).26 Further, the
"hypothetical 'person having ordinary skill in the art' to which the claimed subject matter
pertains would, of necessity, have the capability of understanding the scientific and engineering
principles applicable to the pertinent art."27, 28

B. The Disclosure Provides Sufficient Written Description for the Claimed 
Invention 

Appellants respectfully submit that the instant specification evidences the actual 
reduction to practice of the amino acid sequence of SEQ ID NO:397. Thus, the genus of
polypeptides with at least 80% sequence identity to SEQ ID NO:397 would meet the
requirement of 35 U.S.C. §112, first paragraph, as providing adequate written description. 

Appellants respectfully submit that the instant claims are similar to the exemplary claim 
in Example 10 of the revised Training Manual on Written Description Guidelines issued by the 
U.S. Patent Office. 

Example 10 of the Training Manual clearly states that the protein variants meet the 
requirements of 35 U.S.C. §112, first paragraph, as providing adequate written description for 
the claimed invention even if the specification contemplates but does not exemplify variants of 
the protein if: (1) the procedures for making such variant proteins are routine in the art, (2) the
specification does not describe the complete structure or physical properties of the variants, 
although those skilled in the art would expect members of the genus to have properties similar to 
those of the reference sequence because of high degree of structural similarity, and (3) the 
variant proteins of the genus possess a significant degree of partial structure (see Claim 2 of 
Example 10). 



26 See also M.P.E.P. §2141.03.

27 Ex parte Hiyamizu, 10 U.S.P.Q.2d 1393, 1394 (Bd. Pat. App. & Inter. 1988) (emphasis added).

28 See also M.P.E.P. §2141.03.
Appellants submit that all the requirements in Example 10 are met for the variant
polypeptides of Claims 28-32 and 39-40. In particular, Claims 28-32 and 39-40 require that the
variant polypeptide of PRO1788 share a high sequence identity to SEQ ID NO:397. In addition, the
procedures for making variant polypeptides of SEQ ID NO:397 are well-known in the art and
described in detail in the specification. The instant specification includes extensive step-by-step
guidance on how to make and prepare nucleic acids where the polypeptides
have 80% to 99% identity to the polypeptide of SEQ ID NO:397. For instance, the specification
describes methods for the determination of percent identity between two amino acid sequences. 
In fact, the specification teaches specific parameters to be associated with the term "percent 
identity" as applied to the present invention. The specification further provides detailed 
guidance as to changes that may be made to a PRO polypeptide without adversely affecting its 
activity. This guidance includes a listing of exemplary and preferred substitutions for each of the 
twenty naturally occurring amino acids (Table 6). Accordingly, one of skill in the art could 
identify whether a variant PRO 1788 sequence falls within the parameters of the claimed 
invention. Once such an amino acid sequence is identified, the specification sets forth methods 
for making the amino acid sequences and methods of preparing the PRO polypeptides. 
Appellants claim only those polypeptides which meet the stated guidelines. 
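
The identity thresholds recited in Claims 28-32 can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned, and it counts identical positions against the length of the reference sequence; that denominator, the function names, and the toy sequences are assumptions made for illustration only and are not the specific alignment program or parameters taught in the specification.

```python
def percent_identity(aligned_ref: str, aligned_var: str) -> float:
    """Percent identity of a variant to a reference, given a pre-computed
    alignment in which '-' marks a gap (illustrative convention only)."""
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    # Count positions where both sequences carry the same residue.
    matches = sum(1 for a, b in zip(aligned_ref, aligned_var)
                  if a == b and a != "-")
    # Assumed denominator: number of residues in the reference sequence.
    ref_len = sum(1 for a in aligned_ref if a != "-")
    if ref_len == 0:
        raise ValueError("reference sequence is empty")
    return 100.0 * matches / ref_len


def meets_threshold(aligned_ref: str, aligned_var: str, minimum: float) -> bool:
    """True if the variant has at least `minimum` percent identity."""
    return percent_identity(aligned_ref, aligned_var) >= minimum


# Hypothetical ten-residue sequences differing at one position.
reference = "MKTLLVAGLL"
variant = "MKTLIVAGLL"

identity = percent_identity(reference, variant)
print(identity)
for minimum in (80, 85, 90, 95, 99):
    print(minimum, meets_threshold(reference, variant, minimum))
```

In this toy case the variant would fall within an 80%, 85% or 90% limit but outside a 95% or 99% limit, which is the kind of determination the passage above describes.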

Therefore, Appellants submit that the specification provides ample guidance such that
one skilled in the art would know that Appellants possessed the invention as claimed in the
instant claims, at the time of filing of the application. Accordingly, Appellants respectfully
request reconsideration and reversal of the written description rejection of Claims 28-32 and
39-40 under 35 U.S.C. §112, first paragraph.
CONCLUSION



For the reasons given above, Appellants submit that the present specification clearly
describes, details and provides a patentable utility for the claimed invention. Moreover, it is 
respectfully submitted that based upon this disclosed patentable utility, the present specification 
clearly teaches "how to use" the presently claimed polypeptide. As such, Appellants respectfully 
request reconsideration and reversal of the outstanding rejection of Claims 28-35 and 38-40. 

The Commissioner is authorized to charge any fees which may be required, including
extension fees, or credit any overpayment to Deposit Account No. 50-4634 (referencing
Attorney's Docket No. 123851-181898 (GNE-2830-P1C42)).



Goodwin Procter LLP 

135 Commonwealth Drive 
Menlo Park, CA 94025 
T: 650.752.3100 
F: 650.853.1038 



Respectfully submitted, 



Date: December ^,2008 
VIII. CLAIMS APPENDIX

Claims on Appeal 



28. An isolated polypeptide having at least 80% amino acid sequence identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:397; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:397, lacking its 
associated signal peptide; 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203480; 

29. The isolated polypeptide of Claim 28 having at least 85% amino acid sequence 
identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:397; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:397, lacking its
associated signal peptide; 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203480; 

30. The isolated polypeptide of Claim 28 having at least 90% amino acid sequence 
identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:397; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:397, lacking its
associated signal peptide; 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203480; 

31. The isolated polypeptide of Claim 28 having at least 95% amino acid sequence
identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:397; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:397, lacking its
associated signal peptide; 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203480; 
32. The isolated polypeptide of Claim 28 having at least 99% amino acid sequence
identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:397; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:397, lacking its 
associated signal peptide; 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203480; 

33. An isolated polypeptide comprising: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:397; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:397, lacking its
associated signal peptide; 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203480. 

34. The isolated polypeptide of Claim 33 comprising the amino acid sequence of the 
polypeptide of SEQ ID NO:397. 

35. The isolated polypeptide of Claim 33 comprising the amino acid sequence of the 
polypeptide of SEQ ID NO:397, lacking its associated signal peptide. 

36. The isolated polypeptide of Claim 33 comprising the amino acid sequence of the 
extracellular domain of the polypeptide of SEQ ID NO:397. 

38. The isolated polypeptide of Claim 33 comprising the amino acid sequence of the 
polypeptide encoded by the full-length coding sequence of the cDNA deposited under ATCC 
accession number 203480. 

39. A chimeric polypeptide comprising a polypeptide according to Claim 28 fused to 
a heterologous polypeptide. 

40. The chimeric polypeptide of Claim 39, wherein said heterologous polypeptide is 
an epitope tag or an Fc region of an immunoglobulin. 
IX. EVIDENCE APPENDIX



1. Declaration of Audrey Goddard, Ph.D. under 37 C.F.R. §1.132, with attached Exhibits A-G:

A. Curriculum Vitae of Audrey D. Goddard, Ph.D.

B. Higuchi, R. et al., "Simultaneous amplification and detection of specific
DNA sequences," Biotechnology 10:413-417 (1992).

C. Livak, K.J., et al., "Oligonucleotides with fluorescent dyes at opposite
ends provide a quenched probe system useful for detecting PCR product
and nucleic acid hybridization," PCR Methods Appl. 4:357-362 (1995).

D. Heid, C.A. et al., "Real time quantitative PCR," Genome Res. 6:986-994
(1996).

E. Pennica, D. et al., "WISP genes are members of the connective tissue
growth factor family that are up-regulated in Wnt-1-transformed cells and
aberrantly expressed in human colon tumors," Proc. Natl. Acad. Sci. USA
95:14717-14722 (1998).

F. Pitti, R.M. et al., "Genomic amplification of a decoy receptor for Fas
ligand in lung and colon cancer," Nature 396:699-703 (1998).

G. Bieche, I. et al., "Novel approach to quantitative polymerase chain
reaction using real-time detection: Application to the detection of gene
amplification in breast cancer," Int. J. Cancer 78:661-666 (1998).

2. Declaration of Avi Ashkenazi, Ph.D. under 37 C.F.R. §1.132, with attached Exhibit A
(Curriculum Vitae).

3. Declaration of Paul Polakis, Ph.D. under 37 C.F.R. §1.132.

4. Hyman, E., et al., "Impact of DNA Amplification on Gene Expression Patterns in Breast
Cancer," Cancer Research 62:6240-6245 (2002).

5. Pollack, J.R., et al., "Microarray Analysis Reveals a Major Direct Role of DNA Copy
Number Alteration in the Transcriptional Program of Human Breast Tumors," Proc. Natl.
Acad. Sci. USA 99:12963-12968 (2002).

6. Hanna et al., "HER-2/neu Breast Cancer Predictive Testing," Pathology Associates
Medical Laboratories (1999).

7. Orntoft, T.F., et al., Molecular & Cellular Proteomics 1:37-45 (2002).

8. Hu, Y. et al., "Analysis of genomic and proteomic data using advanced literature mining,"
Journal of Proteome Research 2:405-412 (2003).



9. Pennica, D. et al., "WISP genes are members of the connective tissue growth factor
family that are up-regulated in Wnt-1-transformed cells and aberrantly expressed in
human colon tumors," Proc. Natl. Acad. Sci. USA 83: 4049-52 (1986).

10. Konopka et al., "Variable Expression of the Translocated c-abl oncogene in Philadelphia-
chromosome-positive B-lymphoid cell lines from chronic myelogenous leukemia
patients," Proc. Natl. Acad. Sci. USA 83: 4049-52 (1986).

11. Chen et al., "Discordant Protein and mRNA Expression in Lung Adenocarcinomas", 304
Molecular & Cellular Proteomics 1.4, The American Society for Biochemistry and
Molecular Biology, Inc., (2002).

12. Lewin et al., "Genes VI", Oxford University Press, (1997).

13. Futcher et al., "A Sampling of the Yeast Proteome", (1999).

Items 1-7 were submitted with Appellants' Response filed January 18, 2005, and were 
considered by the Examiner as indicated in the Final Office Action mailed April 28, 2005. 

Item 8 was made of record by the Examiner in the Office Action mailed May 10, 2005. 

Items 9-10 were made of record by the Examiner in the Office Action mailed March 10, 2008. 

Items 11-13 were made of record by the Examiner in the Office Action mailed March 10, 2008. 
X. RELATED PROCEEDINGS APPENDIX

None. 



PATENT



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 




DECLARATION OF AUDREY D. GODDARD, Ph.D UNDER 37 C.F.R. § 1.132 

Assistant Commissioner of Patents 
Washington, D.C. 20231



Sir: 

I, Audrey D. Goddard, Ph.D. do hereby declare and say as follows:

1. I am a Senior Clinical Scientist at the Experimental Medicine/BioOncology, Medical
Affairs Department of Genentech, Inc., South San Francisco, California 94080.

2. Between 1993 and 2001, I headed the DNA Sequencing Laboratory at the Molecular
Biology Department of Genentech, Inc. During this time, my responsibilities included the
identification and characterization of genes contributing to the oncogenic process, and determination
of the chromosomal localization of novel genes.

3. My scientific Curriculum Vitae, including my list of publications, is attached to and
forms part of this Declaration (Exhibit A).






Serial No.: * 
Filed: * 

4. I am familiar with a variety of techniques known in the art for detecting and
quantifying the amplification of oncogenes in cancer, including the quantitative TaqMan PCR (i.e.,
"gene amplification") assay described in the above captioned patent application.

5. The TaqMan PCR assay is described, for example, in the following scientific
publications: Higuchi et al., Biotechnology 10:413-417 (1992) (Exhibit B); Livak et al., PCR
Methods Appl. 4:357-362 (1995) (Exhibit C) and Heid et al., Genome Res. 6:986-994 (1996)
(Exhibit D). Briefly, the assay is based on the principle that successful PCR yields a fluorescent 
signal due to Taq DNA polymerase-mediated exonuclease digestion of a fluorescently labeled 
oligonucleotide that is homologous to a sequence between two PCR primers. The extent of 
digestion depends directly on the amount of PCR, and can be quantified accurately by measuring the 
increment in fluorescence that results from decreased energy transfer. This is an extremely sensitive 
technique, which allows detection in the exponential phase of the PCR reaction and, as a result, 
leads to accurate determination of gene copy number. 

6. The quantitative fluorescent TaqMan PCR assay has been extensively and
successfully used to characterize genes involved in cancer development and progression.
Amplification of protooncogenes has been studied in a variety of human tumors, and is widely
considered as having etiological, diagnostic and prognostic significance. This use of the quantitative
TaqMan PCR assay is exemplified by the following scientific publications: Pennica et al., Proc.
Natl. Acad. Sci. USA 95(25):14717-14722 (1998) (Exhibit E); Pitti et al., Nature
396(6712):699-703 (1998) (Exhibit F) and Bieche et al., Int. J. Cancer 78:661-666 (1998) (Exhibit
G), of the first two of which I am a co-author. In particular, Pennica et al. have used the quantitative
TaqMan PCR assay to study relative gene amplification of WISP and c-myc in various cell lines,
colorectal tumors and normal mucosa. Pitti et al. studied the genomic amplification of a decoy
receptor for Fas ligand in lung and colon cancer, using the quantitative TaqMan PCR assay. Bieche
et al. used the assay to study gene amplification in breast cancer.






7. It is my personal experience that the quantitative TaqMan PCR technique is
technically sensitive enough to detect at least a 2-fold increase in gene copy number relative to
control. It is further my considered scientific opinion that an at least 2-fold increase in gene copy
number in a tumor tissue sample relative to a normal (i.e., non-tumor) sample is significant and
useful in that the detected increase in gene copy number in the tumor sample relative to the normal
sample serves as a basis for using relative gene copy number as quantitated by the TaqMan PCR
technique as a diagnostic marker for the presence or absence of tumor in a tissue sample of unknown
pathology. Accordingly, a gene identified as being amplified at least 2-fold by the quantitative
TaqMan PCR assay in a tumor sample relative to a normal sample is useful as a marker for the
diagnosis of cancer, for monitoring cancer development and/or for measuring the efficacy of cancer
therapy.
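
By way of illustration only, the 2-fold criterion described above can be sketched as a short calculation. The sketch assumes the widely used comparative threshold-cycle (delta-delta Ct) convention with a reference gene and roughly 100% PCR efficiency; neither that convention nor the function names and Ct values below are taken from this Declaration, and all are hypothetical.

```python
def relative_copy_number(ct_target_tumor: float, ct_ref_tumor: float,
                         ct_target_normal: float, ct_ref_normal: float) -> float:
    """Fold change in target gene copy number, tumor versus normal.

    Assumes one PCR cycle corresponds to a doubling of product, so a
    difference of one threshold cycle equals a 2-fold difference.
    """
    delta_tumor = ct_target_tumor - ct_ref_tumor      # normalize to reference gene
    delta_normal = ct_target_normal - ct_ref_normal
    return 2.0 ** -(delta_tumor - delta_normal)


def is_amplified(fold_change: float, threshold: float = 2.0) -> bool:
    """Apply the at-least-2-fold criterion."""
    return fold_change >= threshold


# Hypothetical threshold-cycle readings.
fold = relative_copy_number(ct_target_tumor=24.0, ct_ref_tumor=25.0,
                            ct_target_normal=25.5, ct_ref_normal=25.0)
print(f"{fold:.2f}-fold", is_amplified(fold))
```

A lower threshold cycle in the tumor sample means the signal appeared earlier, so the target started at a higher copy number; in this hypothetical reading the 1.5-cycle shift corresponds to roughly a 2.8-fold increase, which would satisfy the 2-fold criterion.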

8. I declare further that all statements made herein of my own knowledge are true and 
that all statements made on information and belief are believed to be true. I declare that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code, and that such willful false statements may jeopardize the validity of the application or any
patent issuing thereon.
San Francisco, CA 94131

415.841.9154 

415.819.2247 (mobile) 
PROFESSIONAL EXPERIENCE

Genentech, Inc. 1993-present
South San Francisco, CA 

2001 - present Senior Clinical Scientist 

Experimental Medicine / BioOncology, Medical Affairs 

Responsibilities: 

• Companion diagnostic oncology products 

• Acquisition of clinical samples from Genentech's clinical trials for translational research 

• Translational research using clinical specimen and data for drug development and 
diagnostics 

• Member of Development Science Review Committee, Diagnostic Oversight Team, 21 CFR 
Part 11 Subteam 

Interests: 

• Ethical and legal implications of experiments with clinical specimens and data 

• Application of pharmacogenomics in clinical trials 



1998 - 2001 Senior Scientist 

Head of the DNA Sequencing Laboratory, Molecular Biology Department, Research 
Responsibilities: 

• Management of a laboratory of up to nineteen, including postdoctoral fellow, associate
scientist, senior research associate and research assistant/associate levels

• Management of a $750K budget 

• DNA sequencing core facility supporting a 350+ person research facility. 

• DNA sequencing for high throughput gene discovery, - ESTs, cDNAs, and constructs 

• Genomic sequence analysis and gene identification 

• DNA sequence and primary protein analysis 

Research: 

• Chromosomal localization of novel genes 

• Identification and characterization of genes contributing to the oncogenic process 

• Identification and characterization of genes contributing to inflammatory diseases 

• Design and development of schemes for high throughput genomic DNA sequence analysis 

• Candidate gene prediction and evaluation 
1993 - 1998 Scientist



Head of the DNA Sequencing Laboratory, Molecular Biology Department, Research 
Responsibilities 

• DNA sequencing core facility supporting a 350+ person research facility 

• Assumed responsibility for a pre-existing team of five technicians and expanded the group 
into fifteen, introducing a level of middle management and additional areas of research 

• Participated in the development of the basic plan for high throughput secreted protein 
discovery program - sequencing strategies, data analysis and tracking, database design 

• High throughput EST and cDNA sequencing for new gene identification.

• Design and implementation of analysis tools required for high throughput gene identification. 

• Chromosomal localization of genes encoding novel secreted proteins. 

Research: 

• Genomic sequence scanning for new gene discovery. 

• Development of signal peptide selection methods. 

• Evaluation of candidate disease genes. 

• Growth hormone receptor gene SNPs in children with Idiopathic short stature 

Imperial Cancer Research Fund 1989-1992 
London, UK with Dr. Ellen Solomon 

6/89 -12/92 Postdoctoral Fellow 

• Cloning and characterization of the genes fused at the acute promyelocytic leukemia
translocation breakpoints on chromosomes 17 and 15.

• Prepared a successfully funded European Union multi-center grant application 

McMaster University 1983 
Hamilton, Ontario, Canada with Dr. G. D. Sweeney 

5/83 - 8/83: NSERC Summer Student 

• In vitro metabolism of β-naphthoflavone in C57BL/6J and DBA mice



EDUCATION 



Ph.D., 1989
University of Toronto, Toronto, Ontario, Canada. Department of Medical Biophysics.
"Phenotypic and genotypic effects of mutations in the human retinoblastoma gene."
Supervisor: Dr. R. A. Phillips

Honours B.Sc., 1983
McMaster University, Hamilton, Ontario, Canada. Department of Biochemistry.
"The in vitro metabolism of the cytochrome P-448 inducer β-naphthoflavone in C57BL/6J mice."
Supervisor: Dr. G. D. Sweeney
ACADEMIC AWARDS

Imperial Cancer Research Fund Postdoctoral Fellowship          1989-1992
Medical Research Council Studentship                           1983-1988
NSERC Undergraduate Summer Research Award                      1983
Society of Chemical Industry Merit Award (Hons. Biochem.)      1983
Dr. Harry Lyman Hooker Scholarship                             1981-1983
J.L.W. Gill Scholarship                                        1981-1982
Business and Professional Women's Club Scholarship             1980-1981
Wyerhauser Foundation Scholarship                              1979-1980



INVITED PRESENTATIONS 

Genentech's gene discovery pipeline: High throughput identification, cloning and 
characterization of novel genes. Functional Genomics: From Genome to Function, Litchfield 
Park, AZ, USA. October 2000 

High throughput identification, cloning and characterization of novel genes. G2K: Back to
Science, Advances in Genome Biology and Technology I. Marco Island, FL, USA. February 2000

Quality control in DNA Sequencing: The use of Phred and Phrap. Bay Area Sequencing
Users Meeting, Berkeley, CA, USA. April 1999

High throughput secreted protein identification and cloning. Tenth International Genome
Sequencing and Analysis Conference, Miami, FL, USA. September 1998

The evolution of DNA sequencing: The Genentech perspective. Bay Area Sequencing Users
Meeting, Berkeley, CA, USA. May 1998

Partial Growth Hormone Insensitivity: The role of GH-receptor mutations in Idiopathic Short
Stature. Tenth Annual National Cooperative Growth Study Investigators Meeting, San
Francisco, CA, USA. October 1996

Growth hormone (GH) receptor defects are present in selected children with non-GH-deficient
short stature: A molecular basis for partial GH-insensitivity. 76th Annual Meeting of The
Endocrine Society, Anaheim, CA, USA. June 1994

A previously uncharacterized gene, myl, is fused to the retinoic acid receptor alpha gene in
acute promyelocytic leukemia. XV International Association for Comparative Research on
Leukemia and Related Disease, Padua, Italy. October 1991

PATENTS

Goddard A, Godowski PJ and Gurney AL. NL2 Tie ligand homologue polypeptide. Patent
Number: 6,455,496. Date of Patent: Sept. 24, 2002.

Goddard A, Godowski PJ and Gurney AL NL3 Tie ligand homologue nucleic acids. Patent 
Number: 6,426,218. Date of Patent: July 30, 2002. 

Godowski P, Gurney A, Hillan KJ, Botstein D, Goddard A, Roy M, Ferrara N, Tumas D,
Schwall R. NL4 Tie ligand homologue nucleic acid. Patent Number: 6,413,770. Date of
Patent: July 2, 2002.

Ashkenazi A, Fong S, Goddard A, Gurney AL, Napier MA, Tumas D, Wood WI. Nucleic acid
encoding A-33 related antigen polypeptides. Patent Number: 6,410,708. Date of Patent:
Jun. 25, 2002.

Botstein DA, Cohen RL, Goddard AD, Gurney AL, Hillan KJ, Lawrence DA, Levine AJ,
Pennica D, Roy MA and Wood WI. WISP polypeptides and nucleic acids encoding same.
Patent Number: 6,387,657. Date of Patent: May 14, 2002.

Goddard A, Godowski PJ and Gurney AL. Tie ligands. Patent Number: 6,372,491. Date of 
Patent: April 16, 2002. 

Godowski PJ, Gurney AL, Goddard A and Hillan K. TIE ligand homologue antibody. Patent 
Number: 6,350,450. Date of Patent: Feb. 26, 2002. 

Fong S, Ferrara N, Goddard A, Godowski PJ, Gurney AL, Hillan K and Williams PM. Tie 
receptor tyrosine kinase ligand homologues. Patent Number: 6,348,351. Date of Patent: 
Feb. 19, 2002. 

Goddard A, Godowski PJ and Gurney AL. Ligand homologues. Patent Number: 6,348,350. 
Date of Patent: Feb. 19, 2002. 

Attie KM, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth
hormone insensitivity syndrome. Patent Number: 6,207,640. Date of Patent: March 27,
2001.

Fong S, Ferrara N, Goddard A, Godowski PJ, Gurney AL, Hillan K and Williams PM. Nucleic 
acids encoding NL-3. Patent Number: 6,074,873. Date of Patent: June 13, 2000 

Attie K, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth hormone
insensitivity syndrome. Patent Number: 5,824,642. Date of Patent: October 20, 1998

Attie K, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth hormone
insensitivity syndrome. Patent Number: 5,646,113. Date of Patent: July 8, 1997

Multiple additional provisional applications filed

PUBLICATIONS

Seshasayee D, Dowd P, Gu Q, Erickson S, Goddard AD. Comparative sequence analysis of
the HER2 locus in mouse and man. Manuscript in preparation.

Abuzzahab MJ, Goddard A, Grigorescu F, Lautier C, Smith RJ and Chernausek SD. Human 
IGF-1 receptor mutations resulting in pre- and post-natal growth retardation. Manuscript in 
preparation. 

Aggarwal S, Xie M-H, Foster J, Frantz G, Stinson J, Corpuz RT, Simmons L, Hillan K,
Yansura DG, Vandlen RL, Goddard AD and Gurney AL. FHFR, a novel receptor for the
fibroblast growth factors. Manuscript submitted.

Adams SH, Chui C, Schilbach SL, Yu XX, Goddard AD, Grimaldi JC, Lee J, Dowd P, Colman
S, Lewin DA. (2001) BFIT, a unique acyl-CoA thioesterase induced in thermogenic brown
adipose tissue: Cloning, organization of the human gene, and assessment of a potential link
to obesity. Biochemical Journal 360: 135-142.

Lee J, Ho WH, Maruoka M, Corpuz RT, Baldwin DT, Foster JS, Goddard AD, Yansura DG,
Vandlen RL, Wood WI, Gurney AL. (2001) IL-17E, a novel proinflammatory ligand for the
IL-17 receptor homolog IL-17Rh1. Journal of Biological Chemistry 276(2): 1660-1664.

Xie M-H, Aggarwal S, Ho W-H, Foster J, Zhang Z, Stinson J, Wood WI, Goddard AD and
Gurney AL. (2000) Interleukin (IL)-22, a novel human cytokine that signals through the
interferon-receptor related proteins CRF2-4 and IL-22R. Journal of Biological Chemistry 275:
31335-31339.

Weiss GA, Watanabe CK, Zhong A, Goddard A and Sidhu SS. (2000) Rapid mapping of
protein functional epitopes by combinatorial alanine scanning. Proc. Natl. Acad. Sci. USA 97:
8950-8954.

Guo S, Yamaguchi Y, Schilbach S, Wada T, Lee J, Goddard A, French D, Handa H,
Rosenthal A. (2000) A regulator of transcriptional elongation controls vertebrate neuronal
development. Nature 408: 366-369.

Yan M, Wang L-C, Hymowitz SG, Schilbach S, Lee J, Goddard A, de Vos AM, Gao WQ, Dixit 
VM. (2000) Two-amino acid molecular switch in an epithelial morphogen that regulates 
binding to two distinct receptors. Science 290: 523-527. 

Sehl PD, Tai JTN, Hillan KJ, Brown LA, Goddard A, Yang R, Jin H and Lowe DG. (2000) 
Application of cDNA microarrays in determining molecular phenotype in cardiac growth, 
development, and response to injury. Circulation 101: 1990-1999. 

Guo S, Brush J, Teraoka H, Goddard A, Wilson SW, Mullins MC and Rosenthal A. (1999) 
Development of noradrenergic neurons in the zebrafish hindbrain requires BMP, FGF8, and 
the homeodomain protein soulless/Phox2A. Neuron 24: 555-566. 

Stone D, Murone M, Luoh S, Ye W, Armanini P, Gurney A, Phillips HS, Brush J, Goddard
A, de Sauvage FJ and Rosenthal A. (1999) Characterization of the human suppressor of
fused, a negative regulator of the zinc-finger transcription factor Gli. J. Cell Sci. 112:
4437-4448.

Xie M-H, Holcomb I, Deuel B, Dowd P, Huang A, Vagts A, Foster J, Liang J, Brush J, Gu Q,
Hillan K, Goddard A and Gurney AL. (1999) FGF-19, a novel fibroblast growth factor with
unique specificity for FGFR4. Cytokine 11: 729-735.

Yan M, Lee J, Schilbach S, Goddard A and Dixit V. (1999) mE10, a novel caspase
recruitment domain-containing proapoptotic molecule. J. Biol. Chem. 274(15): 10287-10292.

Gurney AL, Marsters SA, Huang RM, Pitti RM, Mark DT, Baldwin DT, Gray AM, Dowd P,
Brush J, Heldens S, Schow P, Goddard AD, Wood WI, Baker KP, Godowski PJ and
Ashkenazi A. (1999) Identification of a new member of the tumor necrosis factor family and its
receptor, a human ortholog of mouse GITR. Current Biology 9(4): 215-218.

Ridgway JBB, Ng E, Kern JA, Lee J, Brush J, Goddard A and Carter P. (1999) Identification
of a human anti-CD55 single-chain Fv by subtractive panning of a phage library using tumor
and nontumor cell lines. Cancer Research 59: 2718-2723.

Pitti RM, Marsters SA, Lawrence DA, Roy M, Kischkel FC, Dowd P, Huang A, Donahue CJ,
Sherwood SW, Baldwin DT, Godowski PJ, Wood WI, Gurney AL, Hillan KJ, Cohen RL,
Goddard AD, Botstein D and Ashkenazi A. (1998) Genomic amplification of a decoy receptor
for Fas ligand in lung and colon cancer. Nature 396(6712): 699-703.

Pennica D, Swanson TA, Welsh JW, Roy MA, Lawrence DA, Lee J, Brush J, Taneyhill LA,
Deuel B, Lew M, Watanabe C, Cohen RL, Melhem MF, Finley GG, Quirke P, Goddard AD,
Hillan KJ, Gurney AL, Botstein D and Levine AJ. (1998) WISP genes are members of the
connective tissue growth factor family that are up-regulated in wnt-1-transformed cells and
aberrantly expressed in human colon tumors. Proc. Natl. Acad. Sci. USA 95(25):
14717-14722.

Yang RB, Mark MR, Gray A, Huang A, Xie MH, Zhang M, Goddard A, Wood WI, Gurney AL
and Godowski PJ. (1998) Toll-like receptor-2 mediates lipopolysaccharide-induced cellular
signalling. Nature 395(6699): 284-288.

Merchant AM, Zhu Z, Yuan JQ, Goddard A, Adams CW, Presta LG and Carter P. (1998) An
efficient route to human bispecific IgG. Nature Biotechnology 16(7): 677-681.

Marsters SA, Sheridan JP, Pitti RM, Brush J, Goddard A and Ashkenazi A. (1998) 
Identification of a ligand for the death-domain-containing receptor Apo3. Current Biology 8(9): 
525-528. 

Xie J, Murone M, Luoh SM, Ryan A, Gu Q, Zhang C, Bonifas JM, Lam CW, Hynes M, 
Goddard A, Rosenthal A, Epstein EH Jr. and de Sauvage FJ. (1998) Activating Smoothened 
mutations in sporadic basal-cell carcinoma. Nature. 391(6662): 90-92. 

Marsters SA, Sheridan JP, Pitti RM, Huang A, Skubatch M, Baldwin D, Yuan J, Gurney A, 
Goddard AD, Godowski P and Ashkenazi A. (1997) A novel receptor for Apo2L/TRAIL 
contains a truncated death domain. Current Biology. 7(12): 1003-1006. 

Hynes M, Stone DM, Dowd M, Pitts-Meek S, Goddard A, Gurney A and Rosenthal A. (1997)
Control of cell pattern in the neural tube by the zinc finger transcription factor Gli1. Neuron
19: 15-26.

Sheridan JP, Marsters SA, Pitti RM, Gurney A, Skubatch M, Baldwin D, Ramakrishnan L,
Gray CL, Baker K, Wood WI, Goddard AD, Godowski P and Ashkenazi A. (1997) Control of
TRAIL-Induced Apoptosis by a Family of Signaling and Decoy Receptors. Science 277
(5327): 818-821.

Goddard AD, Dowd P, Chernausek S, Geffner M, Gertner J, Hintz R, Hopwood N, Kaplan S,
Plotnick L, Rogol A, Rosenfield R, Saenger P, Mauras N, Hershkopf R, Angulo M and Attie K.
(1997) Partial growth hormone insensitivity: The role of growth hormone receptor mutations in
idiopathic short stature. J. Pediatr. 131: S51-55.

Klein RD, Sherman D, Ho WH, Stone D, Bennett GL, Moffat B, Vandlen R, Simmons L, Gu Q,
Hongo JA, Devaux B, Poulsen K, Armanini M, Nozaki C, Asai N, Goddard A, Phillips H,
Henderson CE, Takahashi M and Rosenthal A. (1997) A GPI-linked protein that interacts with
Ret to form a candidate neurturin receptor. Nature 387(6634): 717-21.

Stone DM, Hynes M, Armanini M, Swanson TA, Gu Q, Johnson RL, Scott MP, Pennica D,
Goddard A, Phillips H, Noll M, Hooper JE, de Sauvage F and Rosenthal A. (1996) The
tumour-suppressor gene patched encodes a candidate receptor for Sonic hedgehog. Nature
384(6605): 129-34.

Marsters SA, Sheridan JP, Donahue CJ, Pitti RM, Gray CL, Goddard AD, Bauer KD and
Ashkenazi A. (1996) Apo-3, a new member of the tumor necrosis factor receptor family,
contains a death domain and activates apoptosis and NF-kappa B. Current Biology 6(12):
1669-76.

Rothe M, Xiong J, Shu HB, Williamson K, Goddard A and Goeddel DV. (1996) I-TRAF is a
novel TRAF-interacting protein that regulates TRAF-mediated signal transduction. Proc. Natl.
Acad. Sci. USA 93: 8241-8246.

Yang M, Luoh SM, Goddard A, Reilly D, Henzel W and Bass S. (1996) The bglX gene 
located at 47.8 min on the Escherichia coli chromosome encodes a periplasmic beta- 
glucosidase. Microbiology 142: 1659-65. 

Goddard AD and Black DM. (1996) Familial Cancer in Molecular Endocrinology of Cancer.
Waxman, J. Ed. Cambridge University Press, Cambridge UK, pp. 187-215.

Treanor JJS, Goodman L, de Sauvage F, Stone DM, Poulson KT, Beck CD, Gray C, Armanini 
MP, Pollocks RA, Hefti F, Phillips HS, Goddard A, Moore MW, Buj-Bello A, Davis AM, Asai N, 
Takahashi M, Vandlen R, Henderson CE and Rosenthal A. (1996) Characterization of a 
receptor for GDNF. Nature 382: 80-83. 

Klein RD, Gu Q, Goddard A and Rosenthal A. (1996) Selection for genes encoding secreted
proteins and receptors. Proc. Natl. Acad. Sci. USA 93: 7108-7113.

Winslow JW, Moran P, Valverde J, Shih A, Yuan JQ, Wong SC, Tsai SP, Goddard A, Henzel
WJ, Hefti F and Caras I. (1995) Cloning of AL-1, a ligand for an Eph-related tyrosine kinase
receptor involved in axon bundle formation. Neuron 14: 973-981.

Bennett BD, Zeigler FC, Gu Q, Fendly B, Goddard AD, Gillett N and Matthews W. (1995)
Molecular cloning of a ligand for the EPH-related receptor protein-tyrosine kinase Htk. Proc.
Natl. Acad. Sci. USA 92: 1866-1870.

Huang X, Yuang J, Goddard A, Foulis A, James RF, Lernmark A, Pujol-Borrell R, 
Rabinovitch A, Somoza N and Stewart TA. (1995) Interferon expression in the pancreases of 
patients with type I diabetes. Diabetes 44: 658-664. 

Goddard AD, Yuan JQ, Fairbairn L, Dexter M, Borrow J, Kozak C and Solomon E. (1995)
Cloning of the murine homolog of the leukemia-associated PML gene. Mammalian Genome
6: 732-737.

Goddard AD, Covello R, Luoh SM, Clackson T, Attie KM, Gesundheit N, Rundle AC, Wells
JA, Carlsson LMS and The Growth Hormone Insensitivity Study Group. (1995) Mutations of
the growth hormone receptor in children with idiopathic short stature. N. Engl. J. Med. 333:
1093-1098.

Kuo SS, Moran P, Gripp J, Armanini M, Phillips HS, Goddard A and Caras IW. (1994) 
Identification and characterization of Batk, a predominantly brain-specific non-receptor protein 
tyrosine kinase related to Csk. J. Neurosci. Res. 38: 705-715. 

Mark MR, Scadden DT, Wang Z, Gu Q, Goddard A and Godowski PJ. (1994) Rse, a novel
receptor-type tyrosine kinase with homology to Axl/Ufo, is expressed at high levels in the
brain. Journal of Biological Chemistry 269: 10720-10728.

Borrow J, Shipley J, Howe K, Kiely F, Goddard A, Sheer D, Srivastava A, Antony AC,
Fioretos T, Mitelman F and Solomon E. (1994) Molecular analysis of simple variant
translocations in acute promyelocytic leukemia. Genes Chromosomes Cancer 9: 234-243.

Goddard AD and Solomon E. (1993) Genetics of Cancer. Adv. Hum. Genet 21: 321-376. 

Borrow J, Goddard AD, Gibbons B, Katz F, Swirsky D, Fioretos T, Dube I, Winfield DA, 
Kingston J, Hagemeijer A, Rees JKH, Lister AT and Solomon E. (1992) Diagnosis of acute 
promyelocytic leukemia by RT-PCR: Detection of PML-RARA and RARA-PML fusion 
transcripts. Br. J. Haematol. 82: 529-540. 

Goddard AD, Borrow J and Solomon E. (1992) A previously uncharacterized gene, PML, is
fused to the retinoic acid receptor alpha gene in acute promyelocytic leukemia. Leukemia 6
Suppl 3: 117S-119S.

Zhu X, Dunn JM, Goddard AD, Squire JA, Becker A, Phillips RA and Gallie BL (1992) 
Mechanisms of loss of heterozygosity in retinoblastoma. Cytogenet. Cell. Genet. 59: 248-252. 

Foulkes W, Goddard A. and Patel K. (1991) Retinoblastoma linked with Seascale [letter]. 
British Med. J. 302: 409. 

Goddard AD, Borrow J, Freemont PS and Solomon E. (1991) Characterization of a novel zinc
finger gene disrupted by the t(15;17) in acute promyelocytic leukemia. Science 254:
1371-1374.

Solomon E, Borrow J and Goddard AD. (1991) Chromosomal aberrations in cancer. Science 
254: 1153-1160. 

Pajunen L, Jones TA, Goddard A, Sheer D, Solomon E, Pihlajaniemi T and Kivirikko KI.
(1991) Regional assignment of the human gene coding for a multifunctional peptide (P4HB)
acting as the β-subunit of prolyl-4-hydroxylase and the enzyme protein disulfide isomerase to
17q25. Cytogenet. Cell. Genet. 56: 165-168.

Borrow J, Black DM, Goddard AD, Yagle MK, Frischauf A-M and Solomon E. (1991)
Construction and regional localization of a NotI linking library from human chromosome 17q.
Genomics 10: 477-480.

Borrow J, Goddard AD, Sheer D and Solomon E. (1990) Molecular analysis of acute
promyelocytic leukemia breakpoint cluster region on chromosome 17. Science 249:
1577-1580.

Myers JC, Jones TA, Pohjolainen E-R, Kadri AS, Goddard AD, Sheer D, Solomon E and
Pihlajaniemi T. (1990) Molecular cloning of α5(IV) collagen and assignment of the gene to the
region of the X-chromosome containing the Alport Syndrome locus. Am. J. Hum.
Genet. 46: 1024-1033.

Gallie BL, Squire JA, Goddard A, Dunn JM, Canton M, Hinton D, Zhu X and Phillips RA. 
(1990) Mechanisms of oncogenesis in retinoblastoma. Lab. Invest. 62: 394-408. 

Goddard AD, Phillips RA, Greger V, Passarge E, Hopping W, Gallie BL and Horsthemke B. 
(1990) Use of the RB1 cDNA as a diagnostic probe in retinoblastoma families. Clinical 
Genetics 37: 117-126. 

Zhu XP, Dunn JM, Phillips RA, Goddard AD, Paton KE, Becker A and Gallie BL. (1989)
Germline, but not somatic, mutations of the RB1 gene preferentially involve the paternal
allele. Nature 340: 312-314.

Gallie BL, Dunn JM, Goddard A, Becker A and Phillips RA. (1988) Identification of mutations
in the putative retinoblastoma gene. In Molecular Biology of The Eye: Genes, Vision and
Ocular Disease. UCLA Symposia on Molecular and Cellular Biology, New Series, Volume 88.
J. Piatigorsky, T. Shinohara and P.S. Zelenka, Eds. Alan R. Liss, Inc., New York, 1988, pp.
427-436.

Goddard AD, Balakier H, Canton M, Dunn J, Squire J, Reyes E, Becker A, Phillips RA and 
Gallie BL. (1988) Infrequent genomic rearrangement and normal expression of the putative 
RB1 gene in retinoblastoma tumors. Mol. Cell. Biol. 8: 2082-2088. 

Squire J, Dunn J, Goddard A, Hoffman T, Musarella M, Willard HF, Becker AJ, Gallie BL and
Phillips RA. (1986) Cloning of the esterase D gene: A polymorphic gene probe closely linked
to the retinoblastoma locus on chromosome 13. Proc. Natl. Acad. Sci. USA 83: 6573-6577.

Squire J, Goddard AD, Canton M, Becker A, Phillips RA and Gallie BL (1986) Tumour 
induction by the retinoblastoma mutation is independent of N-myc expression. Nature 322: 
555-557. 

Goddard AD, Heddle JA, Gallie BL and Phillips RA. (1985) Radiation sensitivity of fibroblasts
of bilateral retinoblastoma patients as determined by micronucleus induction in vitro. Mutation
Research 152: 31-38.

SIMULTANEOUS AMPLIFICATION AND DETECTION OF
SPECIFIC DNA SEQUENCES

Russell Higuchi*, Gavin Dollinger¹, P. Sean Walsh and Robert Griffith

Roche Molecular Systems, Inc., 1400 53rd St., Emeryville, CA 94608. ¹Chiron Corporation, 1400 53rd St., Emeryville, CA
94608. *Corresponding author.



We have enhanced the polymerase chain reaction (PCR) such that specific DNA
sequences can be detected without opening the reaction tube. This enhancement
requires the addition of ethidium bromide (EtBr) to a PCR. Since the fluorescence of
EtBr increases in the presence of double-stranded (ds) DNA, an increase in
fluorescence in such a PCR indicates a positive amplification, which can be easily
monitored externally. In fact, amplification can be continuously monitored in order to
follow its progress. The ability to simultaneously amplify specific DNA sequences
and detect the product of the amplification both simplifies and improves PCR and
may facilitate its automation and more widespread use in the clinic or in other
situations requiring high sample throughput.

Although the potential benefits of PCR¹ to clinical diagnostics are well known²,³, it is
still not widely used in this setting, even though it is years since thermostable DNA
polymerase⁴ made PCR practical. Some of the reasons for its slow acceptance are high
cost, lack of automation of pre- and post-PCR processing steps, and false positive results
from carryover contamination. The first two points are related in that labor is the largest
contributor to cost at the present stage of PCR development. Most current assays require
some form of "downstream" processing once thermocycling is done in order to determine
whether the target DNA sequence was present and has amplified. These include DNA
hybridization⁵,⁶, gel electrophoresis with or without use of restriction digestion⁷,⁸,
HPLC⁹, or capillary electrophoresis¹⁰. These methods are labor-intense, have low
throughput, and are difficult to automate. The third point is also closely related to
downstream processing. The handling of the PCR product in these downstream processes
increases the chances that amplified DNA will spread through the typing lab, resulting in
a risk of "carryover" false positives in subsequent testing¹¹.

These downstream processing steps would be eliminated if specific amplification and
detection of amplified DNA took place simultaneously within an unopened reaction vessel.
Assays in which such different processes take place without the need to separate reaction
components have been termed "homogeneous". No truly homogeneous PCR assay has been
demonstrated to date, although progress towards this end has been reported. Chehab, et
al.¹², developed a PCR product detection scheme using fluorescent primers that resulted in
a fluorescent PCR product. Allele-specific primers, each with different fluorescent tags,
were used to indicate the genotype of the DNA. However, the unincorporated primers must
still be removed in a downstream process in order to visualize the result. Recently,
Holland, et al.¹³, developed an assay in which the endogenous 5' exonuclease activity of
Taq DNA polymerase was exploited to cleave a labeled oligonucleotide probe. The probe
would only cleave if PCR amplification had produced its complementary sequence. In
order to detect the cleavage products, however, a subsequent process is again needed.

We have developed a truly homogeneous assay for PCR and PCR product detection based
upon the greatly increased fluorescence that ethidium bromide and other DNA binding dyes
exhibit when they are bound to ds-DNA¹⁴⁻¹⁶. As outlined in Figure 1, a prototypic PCR
begins with primers that are single-stranded DNA (ss-DNA), dNTPs, and DNA polymerase.
An amount of dsDNA containing the target sequence (target DNA) is also typically present.
This amount can vary, depending on the application, from single-cell amounts of DNA¹⁷ to
micrograms per PCR¹⁸. If EtBr is present, the reagents that will fluoresce, in order of
increasing fluorescence, are free EtBr itself, and EtBr bound to the single-stranded DNA
primers and to the double-stranded target DNA (by its intercalation between the stacked
bases of the DNA double-helix). After the first denaturation cycle, target DNA will be
largely single-stranded. After a PCR is completed, the most significant change is the
increase in the amount of dsDNA (the PCR product itself) of up to several micrograms.
Formerly free EtBr is bound to the additional dsDNA, resulting in an increase in
fluorescence. There is also some decrease in the amount of ssDNA primer, but because the
binding of EtBr to ssDNA is much less than to dsDNA, the effect of this change on the
total fluorescence of the sample is small. The fluorescence increase can be measured by
directing excitation illumination through the walls of the amplification vessel before and
after, or even continuously during, thermocycling.

FIGURE 1 Principle of simultaneous amplification and detection of PCR product. The
components of a PCR containing EtBr that are fluorescent are listed: EtBr itself, EtBr
bound to either ssDNA or dsDNA. There is a large fluorescence enhancement when EtBr is
bound to DNA and binding is greatly enhanced when DNA is double-stranded. After
sufficient (n) cycles of PCR, the net increase in dsDNA results in additional EtBr binding,
and a net increase in total fluorescence.

FIGURE 2 Gel electrophoresis of PCR amplification products of the human, nuclear gene,
HLA DQα, made in the presence of increasing amounts of EtBr (up to 8 µg/ml). The
presence of EtBr has no obvious effect on the yield or specificity of amplification.

FIGURE 3 (A) Fluorescence measurements from PCRs that contain 0.5 µg/ml EtBr and that
are specific for Y-chromosome repeat sequences. Five replicate PCRs were begun
containing each of the DNAs specified. At each indicated cycle, one of the five replicate
PCRs for each DNA was removed from thermocycling and its fluorescence measured. Units
of fluorescence are arbitrary. (B) UV photography of PCR tubes (0.5 ml Eppendorf-style,
polypropylene micro-centrifuge tubes) containing reactions, those starting from 2 ng male
DNA and control reactions without any DNA, from (A).

RESULTS 

PCR in the presence of EtBr. In order to assess the effect of EtBr in PCR, amplifications
of the human HLA DQα gene¹⁹ were performed with the dye present at concentrations from
0.06 to 8.0 µg/ml (a typical concentration of EtBr used in staining of nucleic acids
following gel electrophoresis is 0.5 µg/ml). As shown in Figure 2, gel electrophoresis
revealed little or no difference in the yield or quality of the amplification product
whether EtBr was absent or present at any of these concentrations, indicating that EtBr
does not inhibit PCR.

Detection of human Y-chromosome specific sequences. Sequence-specific, fluorescence
enhancement of EtBr as a result of PCR was demonstrated in a series of amplifications
containing 0.5 µg/ml EtBr and primers specific to repeat DNA sequences found on the human
Y-chromosome²⁰. These PCRs initially contained either 60 ng male, 60 ng female, 2 ng male
human or no DNA. Five replicate PCRs were begun for each DNA. After 0, 17, 23, 24 and 29
cycles of thermocycling, a PCR for each DNA was removed from the thermocycler, and its
fluorescence measured in a spectrofluorometer and plotted vs. amplification cycle number
(Fig. 3A). The shape of this curve reflects the fact that by the time an increase in
fluorescence can be detected, the increase in DNA is becoming linear and not exponential
with cycle number. As shown, the fluorescence increased about three-fold over the
background fluorescence for the PCRs containing human male DNA, but did not
significantly increase for negative control PCRs, which contained either no DNA or human
female DNA. The more male DNA present to begin with (60 ng versus 2 ng), the fewer
cycles were needed to give a detectable increase in fluorescence. Gel electrophoresis on
the products of these amplifications showed that DNA fragments of the expected size were
made in the male DNA containing reactions and that little DNA synthesis took place in the
control samples.
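
As a rough illustration of how endpoint readings of this kind might be scored, the following Python sketch computes the fold increase over the cycle-0 background and reports the first sampled cycle at which a reaction would be called positive. The readings, the 2.0-fold cutoff, and the function names are hypothetical choices made for this sketch; they are not data or methods taken from the study.

```python
# Hypothetical readings (arbitrary units) keyed by sampled cycle number.
# These values are invented for illustration; they are not the paper's data.
samples = {
    "60 ng male":   {0: 100.0, 17: 120.0, 23: 260.0, 24: 280.0, 29: 300.0},
    "2 ng male":    {0: 100.0, 17: 100.0, 23: 130.0, 24: 180.0, 29: 290.0},
    "60 ng female": {0: 100.0, 17: 101.0, 23: 103.0, 24: 104.0, 29: 108.0},
    "no DNA":       {0: 100.0, 17: 100.0, 23: 102.0, 24: 103.0, 29: 105.0},
}


def fold_increase(readings):
    """Return the last sampled fluorescence divided by the earliest one."""
    cycles = sorted(readings)
    return readings[cycles[-1]] / readings[cycles[0]]


def first_positive_cycle(readings, threshold=2.0):
    """Return the earliest sampled cycle whose fluorescence is at least
    `threshold` times the background (earliest) reading, or None.
    The default threshold is an assumed cutoff, not one from the paper."""
    cycles = sorted(readings)
    background = readings[cycles[0]]
    for cycle in cycles[1:]:
        if readings[cycle] / background >= threshold:
            return cycle
    return None


for name, readings in samples.items():
    print(name, round(fold_increase(readings), 2), first_positive_cycle(readings))
```

With invented numbers of this shape, the reaction seeded with more target crosses the cutoff at an earlier sampled cycle than the one seeded with less, while the negative controls never cross it.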

In addition, the increase in fluorescence was visualized by simply laying the completed,
unopened PCRs on a UV transilluminator and photographing them through a red filter. This
is shown in Figure 3B for the reactions that began with 2 ng male DNA and those with no
DNA.

Detection of specific alleles of the human β-globin gene. In order to demonstrate that
this approach has adequate specificity to allow genetic screening, a detection of the
sickle-cell anemia mutation was performed. Figure 4 shows the fluorescence from completed
amplifications containing EtBr (0.5 µg/ml) as detected by photography of the reaction
tubes on a UV transilluminator. These reactions were performed using primers specific for
either the wild-type or sickle-cell mutation of the human β-globin gene²¹. The specificity
for each allele is imparted by placing the sickle-mutation site at the terminal 3'
nucleotide of one primer. By using an appropriate primer annealing temperature, primer
extension, and thus amplification, can take place only if the 3' nucleotide of the primer
is complementary to the β-globin allele present²².

Each pair of amplifications shown in Figure 4 consists of a reaction with either the
wild-type allele specific (left tube) or sickle-allele specific (right tube) primers.
Three different DNAs were typed: DNA from a homozygous, wild-type β-globin individual
(AA); from a heterozygous sickle β-globin individual (AS); and from a homozygous sickle
β-globin individual (SS). Each DNA (50 ng genomic DNA to start each PCR) was analyzed in
triplicate (3 pairs of reactions each). The DNA type was reflected in the relative
fluorescence intensities in each pair of completed amplifications. There was a
significant increase in fluorescence only where a β-globin allele DNA matched the primer
set. When measured on a spectrofluorometer (data not shown), this fluorescence was about
three times that present in a PCR where both β-globin alleles were mismatched to the
primer set. Gel electrophoresis (not shown) established that this increase in fluorescence
was due to the synthesis of nearly a microgram of a DNA fragment of the expected size for
β-globin. There was little synthesis of dsDNA in reactions in which the allele-specific
primer was mismatched to both alleles.

Continuous monitoring of a PCR. Using a fiber optic device, it is possible to direct
excitation illumination from a spectrofluorometer to a PCR undergoing thermocycling and to
return its fluorescence to the spectrofluorometer. The fluorescence readout of such an
arrangement, directed at an EtBr-containing amplification of Y-chromosome specific
sequences from 25 ng of human male DNA, is shown in Figure 5. The readout from a control
PCR with no target DNA is also shown. Thirty cycles of PCR were monitored for each.

The fluorescence trace as a function of time clearly shows the effect of the
thermocycling. Fluorescence intensity rises and falls inversely with temperature. The
fluorescence intensity is minimum at the denaturation temperature (94°C) and maximum at
the annealing/extension temperature (50°C). In the negative-control PCR, these
fluorescence maxima and minima do not change significantly over the thirty thermocycles,
indicating that there is little dsDNA synthesis without the appropriate target DNA, and
there is little if any bleaching of EtBr during the continuous illumination of the sample.

In the PCR containing male DNA, the fluorescence maxima at the annealing/extension
temperature begin to increase at about 4000 seconds of thermocycling, and continue to
increase with time, indicating that dsDNA is being produced at a detectable level. Note
that the fluorescence minima at the denaturation temperature do not significantly
increase, presumably because at this temperature there is no dsDNA for EtBr to bind. Thus
the course of the amplification is followed by tracking the fluorescence increase at the
annealing temperature. Analysis of the products of these two amplifications by gel
electrophoresis showed a DNA fragment of the expected size for the male DNA containing
sample and no detectable DNA synthesis for the control sample.
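
The tracking of per-cycle fluorescence maxima described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a fixed cycle duration, a baseline taken from the first few cycles, and a 10% rise criterion are all hypothetical choices for this sketch, and the function names are invented here rather than drawn from the study.

```python
def per_cycle_extrema(times, values, cycle_seconds):
    """Group (time, fluorescence) samples into consecutive cycles of fixed
    duration and return a list of (minimum, maximum) per cycle.
    A fixed cycle duration is an assumption of this sketch."""
    buckets = {}
    for t, v in zip(times, values):
        buckets.setdefault(int(t // cycle_seconds), []).append(v)
    return [(min(buckets[i]), max(buckets[i])) for i in sorted(buckets)]


def onset_cycle(maxima, baseline_cycles=5, rise=0.10):
    """Return the 1-based index of the first cycle whose maximum exceeds the
    mean of the first `baseline_cycles` maxima by the fraction `rise`,
    or None if the maxima stay flat. Both defaults are assumed values."""
    head = maxima[:baseline_cycles]
    if not head:
        return None
    baseline = sum(head) / len(head)
    for index, value in enumerate(maxima, start=1):
        if index > baseline_cycles and value > baseline * (1.0 + rise):
            return index
    return None
```

Applied to a trace like the one described, the per-cycle maxima correspond to readings at the annealing/extension temperature and the minima to readings at the denaturation temperature; only the maxima are expected to rise when product accumulates.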

DISCUSSION 

Downstream processes such as hybridization to a sequence-specific probe can enhance the
specificity of DNA detection by PCR. The elimination of these processes means that the
specificity of this homogeneous assay depends solely on that of PCR. In the case of
sickle-cell disease, we have shown that PCR alone has sufficient DNA sequence specificity
to permit genetic screening. Using appropriate amplification conditions, there is little
non-specific production of dsDNA in the absence of the appropriate target allele.

The specificity required to detect pathogens can be more or
less than that required to do genetic screening, depending on
the number of pathogens in the sample and the amount of other
DNA that must be taken with the sample. A difficult target is
HIV, which requires detection of a viral genome that can be at
the level of a few copies per thousands of host cells 5 .
Compared with genetic screening, which is performed on cells
containing at least one copy of the target sequence, HIV
detection requires both more specificity and the input of more
total

FIGURE 4 UV photography of PCR tubes containing amplifications
using EtBr that are specific to wild-type (A) or sickle (S)
alleles of the human β-globin gene. The left of each pair of
tubes contains allele-specific primers to the wild-type alleles,
the right tube primers to the sickle allele. The photograph was
taken after 30 cycles of PCR, and the input DNAs and the alleles
they contain are indicated. Fifty ng of DNA was used to begin
PCR. Typing was done in triplicate (3 pairs of PCRs) for each
input DNA.



[Figure 5 plot: fluorescence traces for 20 ng of male DNA (top) and the no DNA control (bottom)]

FIGURE 5 Continuous, real-time monitoring of a PCR. A fiber
optic was used to carry excitation light to a PCR in progress
and also emitted light back to a fluorometer (see Experimental
Protocol). Amplification using human male-DNA specific primers
in a PCR starting with 20 ng of human male DNA (top), or in a
control PCR without DNA (bottom), were monitored. Thirty cycles
of PCR were followed for each. The temperature cycled between
94°C (denaturation) and 50°C (annealing and extension). Note in
the male DNA PCR, the cycle (time) dependent increase in
fluorescence at the annealing/extension temperature.

DNA (up to microgram amounts) in order to have sufficient
numbers of target sequences. This large amount of starting DNA
in an amplification significantly increases the background
fluorescence over which any additional fluorescence produced by
PCR must be detected. An additional complication that occurs
with targets in low copy-number is the formation of the
"primer-dimer" artifact. This is the result of the extension of
one primer using the other primer as a template. Although this
occurs infrequently, once it occurs the extension product is a
substrate for PCR amplification, and can compete with true PCR
targets if those targets are rare. The primer-dimer product is
of course dsDNA and thus is a potential source of false signal
in this homogeneous assay.

To increase PCR specificity and reduce the effect of
primer-dimer amplification, we are investigating a number of
approaches, including the use of nested-primer amplifications
that take place in a single tube 8 , and the "hot-start", in
which nonspecific amplification is reduced by raising the
temperature of the reaction before DNA synthesis begins 25 .
Preliminary results using these approaches suggest that
primer-dimer is effectively reduced and it is possible to detect
the increase in EtBr fluorescence in a PCR instigated by a
single HIV genome in a background of 10* cells. With larger
numbers of cells, the background fluorescence contributed by
genomic DNA becomes problematic. To reduce this background, it
may be possible to use sequence-specific DNA-binding dyes that
can be made to preferentially bind PCR product over genomic DNA
by incorporating the dye-binding DNA sequence into the PCR
product through a 5' "add-on" to the oligonucleotide primer 24 .

We have shown that the detection of fluorescence generated by
an EtBr-containing PCR is straightforward, both once PCR is
completed and continuously during thermocycling. The ease with
which automation of specific DNA detection can be accomplished
is the most promising aspect of this assay. The fluorescence
analysis of completed PCRs is already possible with existing
instrumentation in 96-well format. In this format, the
fluorescence in each PCR can be quantitated before, after, and
even at selected points during thermocycling by moving the rack
of PCRs to a 96-microwell plate fluorescence reader 26 .

The instrumentation necessary to continuously monitor multiple
PCRs simultaneously is also simple in principle. A direct
extension of the apparatus used here is to have multiple
fiberoptics transmit the excitation light and fluorescent
emissions to and from multiple PCRs. The ability to monitor
multiple PCRs continuously may allow quantitation of target DNA
copy number. Figure 3 shows that the larger the amount of
starting target DNA, the sooner during PCR a fluorescence
increase is detected. Preliminary experiments (Higuchi and
Dollinger, manuscript in preparation) with continuous
monitoring have shown a sensitivity to two-fold differences in
initial target DNA concentration.

Conversely, if the number of target molecules is known, as it
can be in genetic screening, continuous monitoring may provide
a means of detecting false positive and false negative results.
With a known number of target molecules, a true positive would
exhibit detectable fluorescence by a predictable number of
cycles of PCR. Increases in fluorescence detected before or
after that cycle would indicate potential artifacts. False
negative results due to, for example, inhibition of DNA
polymerase, may be detected by including within each PCR an
inefficiently amplifying marker. This marker results in a
fluorescence increase only after a large number of cycles, many
more than are necessary to detect a true positive. If a sample
fails to have a fluorescence increase after this many cycles,
inhibition may be suspected. Since, in this assay, conclusions
are drawn based on the presence or absence of fluorescence
signal alone, such controls may be important. In any event,
before any test based on this principle is ready for the
clinic, an assessment of its false positive/false negative
rates will need to be obtained using a large number of known
samples.

In summary, the inclusion in PCR of dyes whose fluorescence is
enhanced upon binding dsDNA makes it possible to detect
specific DNA amplification from outside the PCR tube. In the
future, instruments based upon this principle may facilitate
the more widespread use of PCR in applications that demand the
high throughput of samples.

EXPERIMENTAL PROTOCOL 

Human HLA-DQα gene amplifications containing EtBr. PCRs were
set up in 100 µl volumes containing 10 mM Tris-HCl, pH 8.3;
50 mM KCl; 4 mM MgCl2; 2.5 units of Taq DNA polymerase
(Perkin-Elmer Cetus, Norwalk, CT); 20 pmole each of human
HLA-DQα gene specific oligonucleotide primers GH26 and GH27 19
and approximately HP copies of DQα PCR product diluted from a
previous reaction. Ethidium bromide (EtBr; Sigma) was used at
the concentrations indicated in Figure 2. Thermocycling
proceeded for 20 cycles in a model 480 thermocycler
(Perkin-Elmer Cetus, Norwalk, CT) using a "step-cycle" program
of 94°C for 1 min. denaturation and 60°C for 30 sec. annealing
and 72°C for 30 sec. extension.

Y-chromosome specific PCR. PCRs (100 µl total reaction volume)
containing 0.5 µg/ml EtBr were prepared as described for
HLA-DQα, except with different primers and target DNAs. These
PCRs contained 15 pmole each male DNA-specific primers Y1.1 and
Y1.2 20 , and either 60 ng male, 60 ng female, 2 ng male, or no
human DNA. Thermocycling was 94°C for 1 min. and 60°C for
1 min. using a "step-cycle" program. The number of cycles for a
sample were as indicated in Figure 3. Fluorescence measurement
is described below.

Allele-specific, human β-globin gene PCR. Amplifications of
100 µl volume using 0.5 µg/ml of EtBr were prepared as
described for HLA-DQα above except with different primers and
target DNAs. These PCRs contained either primer pair
HGP2/Hβ14A (wild-type globin specific primers) or HGP2/Hβ14S
(sickle-globin specific primers) at 10 pmole each primer per
PCR. These primers were developed by Wu et al. 21 . Three
different target DNAs were used in separate amplifications:
50 ng each of human DNA that was homozygous for the sickle
trait (SS), DNA that was heterozygous for the sickle trait
(AS), or DNA that was homozygous for the w.t. globin (AA).
Thermocycling was for 30 cycles at 94°C for 1 min. and 55°C for
1 min. using a "step-cycle" program. An annealing temperature
of 55°C had been shown by Wu et al. 21 to provide
allele-specific amplification. Completed PCRs were photographed
through a red filter (Wratten 23A) after placing the reaction
tubes atop a model TM-36 transilluminator (UV-products, San
Gabriel, CA).

Fluorescence measurement. Fluorescence measurements were made
on PCRs containing EtBr in a Fluorolog-2 fluorometer (SPEX,
Edison, NJ). Excitation was at the 500 nm band with about 2 nm
bandwidth with a GG 435 nm cut-off filter (Melles Griot Inc.,
Irvine, CA) to exclude second-order light. Emitted light was
detected at 570 nm with a bandwidth of about 7 nm. An OG 530 nm
cut-off filter was used to remove the excitation light.

Continuous fluorescence monitoring of PCR. Continuous
monitoring of a PCR in progress was accomplished using the
spectrofluorometer and settings described above as well as a
fiberoptic accessory (SPEX cat. no. 1950) to both send
excitation light to, and receive emitted light from, a PCR
placed in a well of a model 480 thermocycler (Perkin-Elmer
Cetus). The probe end of the fiberoptic cable was attached with
"5 minute-epoxy" to the open top of a PCR tube (a 0.5 ml
polypropylene centrifuge tube with its cap removed) effectively
sealing it. The exposed top of the PCR tube and the end of the
fiberoptic cable were shielded from room light and the room
lights were kept dimmed during each run. The monitored PCR was
an amplification of Y-chromosome-specific repeat sequences as
described above, except using an annealing/extension
temperature of 50°C. The reaction was covered with mineral oil
(2 drops) to prevent evaporation. Thermocycling and
fluorescence measurement were started simultaneously. A
time-base scan with a 10 second integration time was used and
the emission signal was ratioed to the excitation signal to
control for changes in light-source intensity. Data were
collected using the dm3000f, version 15 (SPEX) data system.
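
The ratioing step described above can be illustrated with a minimal Python sketch. It is not the SPEX data system's own procedure; the function name `ratio_signal` and the sample readings are hypothetical, and the only idea taken from the protocol is that each emission reading is divided by the excitation reading taken at the same time, so that drift in the light source cancels out.

```python
from typing import List, Sequence


def ratio_signal(emission: Sequence[float], excitation: Sequence[float]) -> List[float]:
    """Divide each emission reading by the simultaneous excitation reading.

    Hypothetical helper: a lamp that dims or brightens scales both
    readings by the same factor, so the ratio stays unchanged.
    """
    if len(emission) != len(excitation):
        raise ValueError("emission and excitation must have the same length")
    if any(x <= 0 for x in excitation):
        raise ValueError("excitation readings must be positive")
    return [e / x for e, x in zip(emission, excitation)]


# Illustrative readings (arbitrary units): the lamp output drifts by
# about 10% in either direction, yet the ratio is constant.
ratios = ratio_signal([100.0, 90.0, 110.0], [10.0, 9.0, 11.0])
```

In this made-up example the raw emission varies between 90 and 110 units, while the ratioed signal is the same at every time point.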
SIMULTANEOUS AMPLIFICATION AND DETECTION OF
SPECIFIC DNA SEQUENCES

Russell Higuchi*, Gavin Dollinger 1 , Sean Walsh and Robert Griffith

Roche Molecular Systems, Inc., 1400 53rd St., Emeryville, CA 94608. 1 Chiron Corporation, 1400 53rd St., Emeryville, CA
94608. *Corresponding author.



We have enhanced the polymerase chain
reaction (PCR) such that specific DNA
sequences can be detected without opening
the reaction tube. This enhancement
requires the addition of ethidium bromide
(EtBr) to a PCR. Since the fluorescence of
EtBr increases in the presence of
double-stranded (ds) DNA, an increase in
fluorescence in such a PCR indicates a positive
amplification, which can be easily monitored
externally. In fact, amplification can
be continuously monitored in order to
follow its progress. The ability to
simultaneously amplify specific DNA sequences
and detect the product of the amplification
both simplifies and improves PCR and
may facilitate its automation and more
widespread use in the clinic or in other
situations requiring high sample throughput.

Although the potential benefits of PCR to clinical diagnostics
are well known 2,3 , it is still not widely used in this
setting, even though it is four years since thermostable DNA
polymerases made PCR practical. Some of the reasons for its
slow acceptance are high cost, lack of automation of pre- and
post-PCR processing steps, and false positive results from
carryover-contamination. The first two points are related in
that labor is the largest contributor to cost at the present
stage of PCR development. Most current assays require some form
of "downstream" processing once thermocycling is done in order
to determine whether the target DNA sequence was present and
has amplified. These include DNA hybridization, gel
electrophoresis with or without use of restriction digestion,
HPLC, or capillary electrophoresis 10 . These methods are
labor-intense, have low throughput, and are difficult to
automate. The third point is also closely related to downstream
processing. The handling of the PCR product in these downstream
processes increases the chances that amplified DNA will spread
through the typing lab, resulting in a risk of "carryover"
false positives in subsequent testing 11 .

These downstream processing steps would be eliminated if
specific amplification and detection of amplified DNA took
place simultaneously within an unopened reaction vessel. Assays
in which such different processes take place without the need
to separate reaction components have been termed "homogeneous".
No truly homogeneous PCR assay has been demonstrated to date,
although progress towards this end has been reported. Chehab,
et al. 12 developed a PCR product detection scheme using
fluorescent primers that resulted in a fluorescent PCR product.
Allele-specific primers, each with different fluorescent tags,
were used to indicate the genotype of the DNA. However, the
unincorporated primers must still be removed in a downstream
process in order to visualize the result. Recently, Holland, et
al. 13 , developed an assay in which the endogenous 5'
exonuclease activity of Taq DNA polymerase was exploited to
cleave a labeled oligonucleotide probe. The probe would only
cleave if PCR amplification had produced its complementary
sequence. In order to detect the cleavage products, however, a
subsequent process is again needed.

We have developed a truly homogeneous assay for PCR and PCR
product detection based upon the greatly increased fluorescence
that ethidium bromide and other DNA binding dyes exhibit when
they are bound to dsDNA 14-16 . As outlined in Figure 1, a
prototypic PCR



[Figure 1 diagram labels: ssDNA primers; target sequence; dsDNA PCR product]

FIGURE 1 Principle of simultaneous amplification and detection
of PCR product. The components of a PCR containing EtBr that
are fluorescent are listed: EtBr itself, EtBr bound to either
ssDNA or dsDNA. There is a large fluorescence enhancement when
EtBr is bound to DNA and binding is greatly enhanced when DNA
is double-stranded. After sufficient (n) cycles of PCR, the net
increase in dsDNA results in additional EtBr binding, and a net
increase in total fluorescence.

FIGURE 2 Gel electrophoresis of PCR amplification products of
the human, nuclear gene, HLA DQα, made in the presence of
increasing amounts of EtBr. The presence of EtBr has no obvious
effect on the yield or specificity of amplification.

FIGURE 3 (A) Fluorescence measurements from PCRs that contain
0.5 µg/ml EtBr and that are specific for Y-chromosome repeat
sequences. Five replicate PCRs were begun containing each of
the DNAs specified. At each indicated cycle, one of the five
replicate PCRs for each DNA was removed from thermocycling and
its fluorescence measured. Units of fluorescence are arbitrary.
(B) UV photography of PCR tubes (0.5 ml Eppendorf-style,
polypropylene micro-centrifuge tubes) containing reactions,
those starting from 2 ng male DNA and control reactions without
any DNA, from (A).



begins with primers that are single-stranded DNA (ssDNA),
dNTPs, and DNA polymerase. An amount of dsDNA containing the
target sequence (target DNA) is also typically present. This
amount can vary, depending on the application, from single-cell
amounts of DNA 17 to micrograms per PCR 18 . If EtBr is
present, the reagents that will fluoresce, in order of
increasing fluorescence, are free EtBr itself, and EtBr bound
to the single-stranded DNA primers and to the double-stranded
target DNA (by its intercalation between the stacked bases of
the DNA double-helix). After the first denaturation cycle,
target DNA will be largely single-stranded. After a PCR is
completed, the most significant change is the increase in the
amount of dsDNA (the PCR product itself) of up to several
micrograms. Formerly free EtBr is bound to the additional
dsDNA, resulting in an increase in fluorescence. There is also
some decrease in the amount of ssDNA primer, but because the
binding of EtBr to ssDNA is much less than to dsDNA, the effect
of this change on the total fluorescence of the sample is
small. The fluorescence increase can be measured by directing
excitation illumination through the walls of the amplification
vessel before and after, or even continuously during,
thermocycling.
RESULTS 

PCR in the presence of EtBr. In order to assess the effect of
EtBr in PCR, amplifications of the human HLA DQα gene 19 were
performed with the dye present at concentrations from 0.06 to
8.0 µg/ml (a typical concentration of EtBr used in staining of
nucleic acids following gel electrophoresis is 0.5 µg/ml). As
shown in Figure 2, gel electrophoresis revealed little or no
difference in the yield or quality of the amplification product
whether EtBr was absent or present at any of these
concentrations, indicating that EtBr does not inhibit PCR.

Detection of human Y-chromosome specific sequences.
Sequence-specific, fluorescence enhancement of EtBr as a result
of PCR was demonstrated in a series of amplifications
containing 0.5 µg/ml EtBr and primers specific to repeat DNA
sequences found on the human Y-chromosome 20 . These PCRs
initially contained either 60 ng male, 60 ng female, 2 ng male
human or no DNA. Five replicate PCRs were begun for each DNA.
After 0, 17, 21, 24 and 29 cycles of thermocycling, a PCR for
each DNA was removed from the thermocycler, and its
fluorescence measured in a spectrofluorometer and plotted vs.
amplification cycle number (Fig. 3A). The shape of this curve
reflects the fact that by the time an increase in fluorescence
can be detected, the increase in DNA is becoming linear and not
exponential with cycle number. As shown, the fluorescence
increased about three-fold over the background fluorescence for
the PCRs containing human male DNA, but did not significantly
increase for negative control PCRs, which contained either no
DNA or human female DNA. The more male DNA present to begin
with (60 ng versus 2 ng), the fewer cycles were needed to give
a detectable increase in fluorescence. Gel electrophoresis on
the products of these amplifications showed that DNA fragments
of the expected size were made in the male DNA containing
reactions and that little DNA synthesis took place in the
control samples.

In addition, the increase in fluorescence was visualized by
simply laying the completed, unopened PCRs on a UV
transilluminator and photographing them through a red filter.
This is shown in Figure 3B for the reactions that began with
2 ng male DNA and those with no DNA.

Detection of specific alleles of the human β-globin gene. In
order to demonstrate that this approach has adequate
specificity to allow genetic screening, a detection of the
sickle-cell anemia mutation was performed. Figure 4 shows the
fluorescence from completed amplifications containing EtBr
(0.5 µg/ml) as detected by photography of the reaction tubes on
a UV transilluminator. These reactions were performed using
primers specific for either the wild-type or sickle-cell
mutation of the human β-globin gene 21 . The specificity for
each allele is imparted by placing the sickle-mutation site at
the terminal 3' nucleotide of one primer. By using an
appropriate primer annealing temperature, primer extension, and
thus amplification, can take place only if the 3' nucleotide of
the primer is complementary to the β-globin allele present.

Each pair of amplifications shown in Figure 4 consists of a
reaction with either the wild-type allele specific (left tube)
or sickle-allele specific (right tube) primers. Three different
DNAs were typed: DNA from a homozygous, wild-type β-globin
individual (AA); from a heterozygous sickle β-globin individual
(AS); and from a homozygous sickle β-globin individual (SS).
Each DNA (50 ng genomic DNA to start each PCR) was analyzed in
triplicate (3 pairs of reactions each). The DNA type was
reflected in the relative fluorescence intensities in each pair
of completed amplifications. There was a significant increase
in fluorescence only where a β-globin allele DNA matched the
primer set. When measured on a spectrofluorometer (data not
shown), this fluorescence was about three times that present in
a PCR where both β-globin alleles were mismatched to the primer
set. Gel electrophoresis (not shown) established that this
increase in fluorescence was due to the synthesis of nearly a
microgram of a DNA fragment of the expected size for β-globin.
There was little synthesis of dsDNA in reactions in which the
allele-specific primer was mismatched to both alleles.

Continuous monitoring of a PCR. Using a fiber optic device, it
is possible to direct excitation illumination from a
spectrofluorometer to a PCR undergoing thermocycling and to
return its fluorescence to the spectrofluorometer. The
fluorescence readout of such an arrangement, directed at an
EtBr-containing amplification of Y-chromosome specific
sequences from 25 ng of human male DNA, is shown in Figure 5.
The readout from a control PCR with no target DNA is also
shown. Thirty cycles of PCR were monitored for each.

The fluorescence trace as a function of time clearly shows the
effect of the thermocycling. Fluorescence intensity rises and
falls inversely with temperature. The fluorescence intensity is
minimum at the denaturation temperature (94°C) and maximum at
the annealing/extension temperature (50°C). In the
negative-control PCR, these fluorescence maxima and minima do
not change significantly over the thirty thermocycles,
indicating that there is little dsDNA synthesis without the
appropriate target DNA, and there is little if any bleaching of
EtBr during the continuous illumination of the sample.

In the PCR containing male DNA, the fluorescence maxima at
the annealing/extension temperature begin to increase at about
4000 seconds of thermocycling, and continue to increase with
time, indicating that dsDNA is being produced at a detectable
level. Note that the fluorescence minima at the denaturation
temperature do not significantly increase, presumably because
at this temperature there is no dsDNA for EtBr to bind. Thus
the course of the amplification is followed by tracking the
fluorescence increase at the annealing temperature. Analysis of
the products of these two amplifications by gel electrophoresis
showed a DNA fragment of the expected size for the male DNA
containing sample and no detectable DNA synthesis for the
control sample.
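
The tracking of fluorescence maxima described above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' analysis: the function names, the fixed number of readings per cycle, the baseline window and the 1.2-fold rise criterion are all hypothetical, and the trace values are invented.

```python
from typing import List, Optional, Sequence


def per_cycle_maxima(trace: Sequence[float], samples_per_cycle: int) -> List[float]:
    """Return the highest reading within each complete thermocycle.

    Assumes (hypothetically) that every cycle contributes the same
    number of readings, so the trace can be cut into equal slices.
    """
    if samples_per_cycle <= 0:
        raise ValueError("samples_per_cycle must be positive")
    n_cycles = len(trace) // samples_per_cycle
    return [
        max(trace[i * samples_per_cycle:(i + 1) * samples_per_cycle])
        for i in range(n_cycles)
    ]


def first_rising_cycle(
    maxima: Sequence[float], baseline_cycles: int, factor: float
) -> Optional[int]:
    """Return the 1-based cycle whose maximum first exceeds
    factor * (mean of the first baseline_cycles maxima), or None."""
    if not 0 < baseline_cycles <= len(maxima):
        raise ValueError("baseline_cycles must be between 1 and len(maxima)")
    baseline = sum(maxima[:baseline_cycles]) / baseline_cycles
    for cycle, value in enumerate(maxima, start=1):
        if cycle > baseline_cycles and value > factor * baseline:
            return cycle
    return None


# Invented traces, two readings per cycle (denaturation, annealing).
male_trace = [2, 10, 2, 10, 2, 10, 2, 13, 2, 16, 2, 20]
control_trace = [2, 10] * 6

male_maxima = per_cycle_maxima(male_trace, samples_per_cycle=2)
control_maxima = per_cycle_maxima(control_trace, samples_per_cycle=2)
```

With these invented numbers, the maxima of the first trace rise from the fourth cycle onward, while the control trace never exceeds its baseline, which mirrors the qualitative behavior reported for the male DNA and no-DNA reactions.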

DISCUSSION 

Downstream processes such as hybridization to a
sequence-specific probe can enhance the specificity of DNA
detection by PCR. The elimination of these processes means
that the specificity of this homogeneous assay depends solely
on that of PCR. In the case of sickle-cell disease, we have
shown that PCR alone has sufficient DNA sequence specificity
to permit genetic screening. Using appropriate amplification
conditions, there is little non-specific production of dsDNA in
the absence of the appropriate target allele.

The specificity required to detect pathogens can be more or
less than that required to do genetic screening, depending on
the number of pathogens in the sample and the amount of other
DNA that must be taken with the sample. A difficult target is
HIV, which requires detection of a viral genome that can be at
the level of a few copies per thousands of host cells.
Compared with genetic screening, which is performed on cells
containing at least one copy of the target sequence, HIV
detection requires both more specificity and the input of more
total

FIGURE 4 UV photography of PCR tubes containing amplifications
using EtBr that are specific to wild-type (A) or sickle (S)
alleles of the human β-globin gene. The left of each pair of
tubes contains allele-specific primers to the wild-type alleles,
the right tube primers to the sickle allele. The photograph was
taken after 30 cycles of PCR, and the input DNAs and the alleles
they contain are indicated. Fifty ng of DNA was used to begin
PCR. Typing was done in triplicate (3 pairs of PCRs) for each
input DNA.



[Figure 5 plot: fluorescence traces for 20 ng of male DNA (top) and the no DNA control (bottom); x-axis: time (sec), 0 to 8000]

FIGURE 5 Continuous, real-time monitoring of a PCR. A fiber
optic was used to carry excitation light to a PCR in progress
and also emitted light back to a fluorometer (see Experimental
Protocol). Amplification using human male-DNA specific primers
in a PCR starting with 20 ng of human male DNA (top), or in a
control PCR without DNA (bottom), were monitored. Thirty cycles
of PCR were followed for each. The temperature cycled between
94°C (denaturation) and 50°C (annealing and extension). Note in
the male DNA PCR, the cycle (time) dependent increase in
fluorescence at the annealing/extension temperature.

DNA (up to microgram amounts) in order to have sufficient
numbers of target sequences. This large amount of starting DNA
in an amplification significantly increases the background
fluorescence over which any additional fluorescence produced by
PCR must be detected. An additional complication that occurs
with targets in low copy-number is the formation of the
"primer-dimer" artifact. This is the result of the extension of
one primer using the other primer as a template. Although this
occurs infrequently, once it occurs the extension product is a
substrate for PCR amplification, and can compete with true PCR
targets if those targets are rare. The primer-dimer product is
of course dsDNA and thus is a potential source of false signal
in this homogeneous assay.

To increase PCR specificity and reduce the effect of
primer-dimer amplification, we are investigating a number of
approaches, including the use of nested-primer amplifications
that take place in a single tube 3 , and the "hot-start", in
which nonspecific amplification is reduced by raising the
temperature of the reaction before DNA synthesis begins 25 .
Preliminary results using these approaches suggest that
primer-dimer is effectively reduced and it is possible to detect
the increase in EtBr fluorescence in a PCR instigated by a
single HIV genome in a background of 10* cells. With larger
numbers of cells, the background fluorescence contributed by
genomic DNA becomes problematic. To reduce this background, it
may be possible to use sequence-specific DNA-binding dyes that
can be made to preferentially bind PCR product over genomic DNA
by incorporating the dye-binding DNA sequence into the PCR
product through a 5' "add-on" to the oligonucleotide primer 24 .

We have shown that the detection of fluorescence generated by
an EtBr-containing PCR is straightforward, both once PCR is
completed and continuously during thermocycling. The ease with
which automation of specific DNA detection can be accomplished
is the most promising aspect of this assay. The fluorescence
analysis of completed PCRs is already possible with existing
instrumentation in 96-well format. In this format, the
fluorescence in each PCR can be quantitated before, after, and
even at selected points during thermocycling by moving the rack
of PCRs to a 96-microwell plate fluorescence reader 26 .

The instrumentation necessary to continuously monitor multiple
PCRs simultaneously is also simple in principle. A direct
extension of the apparatus used here is to have multiple
fiberoptics transmit the excitation light and fluorescent
emissions to and from multiple PCRs. The ability to monitor
multiple PCRs continuously may allow quantitation of target DNA
copy number. Figure 3 shows that the larger the amount of
starting target DNA, the sooner during PCR a fluorescence
increase is detected. Preliminary experiments (Higuchi and
Dollinger, manuscript in preparation) with continuous
monitoring have shown a sensitivity to two-fold differences in
initial target DNA concentration.
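
Why a larger starting amount is detected sooner can be illustrated with an idealized calculation. The Python sketch below assumes purely exponential amplification, in which product grows by a factor of (1 + efficiency) per cycle until a fixed detection threshold is reached; this is a simplification, since the text notes that growth is already becoming linear by the time fluorescence is detectable. The function name, the threshold value and the copy numbers are hypothetical.

```python
import math


def cycles_to_threshold(
    initial_amount: float, threshold_amount: float, efficiency: float = 1.0
) -> float:
    """Cycles needed for initial_amount to grow to threshold_amount.

    Idealized model (an assumption, not the paper's analysis):
    amount after n cycles = initial_amount * (1 + efficiency) ** n,
    where efficiency = 1.0 means perfect doubling every cycle.
    """
    if initial_amount <= 0 or threshold_amount <= 0:
        raise ValueError("amounts must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    if initial_amount >= threshold_amount:
        return 0.0
    return math.log(threshold_amount / initial_amount) / math.log(1.0 + efficiency)


# With perfect doubling, a two-fold difference in starting amount
# shifts the threshold crossing by one cycle.
threshold = 1.0e12  # hypothetical detection threshold, arbitrary units
shift_two_fold = cycles_to_threshold(1.0e4, threshold) - cycles_to_threshold(2.0e4, threshold)

# A 30-fold difference (such as 60 ng versus 2 ng of the same DNA)
# corresponds to log2(30), a little under five cycles, in this model.
shift_thirty_fold = cycles_to_threshold(2.0, threshold) - cycles_to_threshold(60.0, threshold)
```

Under these assumptions, resolving two-fold differences in initial target concentration amounts to resolving a shift of about one cycle in the onset of the fluorescence increase.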

Conversely, if the number of target molecules is known, as it
can be in genetic screening, continuous monitoring may provide
a means of detecting false positive and false negative results.
With a known number of target molecules, a true positive would
exhibit detectable fluorescence by a predictable number of
cycles of PCR. Increases in fluorescence detected before or
after that cycle would indicate potential artifacts. False
negative results due to, for example, inhibition of DNA
polymerase, may be detected by including within each PCR an
inefficiently amplifying marker. This marker results in a
fluorescence increase only after a large number of cycles, many
more than are necessary to detect a true positive. If a sample
fails to have a fluorescence increase after this many cycles,
inhibition may be suspected. Since, in this assay, conclusions
are drawn based on the presence or absence of fluorescence
signal alone, such controls may be important. In any event,
before any test based on this principle is ready for the
clinic, an assessment of its false positive/false negative
rates will need to be obtained using a large number of known
samples.
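
The control logic proposed in this paragraph can be written out as a small decision rule. The Python sketch below is one possible reading of it, not a validated procedure: the function name, the tolerance window, the category labels and the example cycle numbers are hypothetical, and it assumes that each run lasts at least as many cycles as the inefficient marker needs to produce a signal.

```python
from typing import Optional


def classify_reaction(
    detected_cycle: Optional[int],
    expected_cycle: int,
    marker_cycle: int,
    tolerance: int = 2,
) -> str:
    """Classify one monitored PCR from the cycle at which a fluorescence
    increase was first detected (None if no increase was seen).

    expected_cycle: cycle at which a true positive should be detected.
    marker_cycle:   cycle at which the inefficiently amplifying marker
                    alone should give a signal (much later).
    """
    if marker_cycle <= expected_cycle + tolerance:
        raise ValueError("marker must amplify well after the expected cycle")
    if detected_cycle is None:
        # Not even the marker amplified: the reaction may be inhibited.
        return "inhibition suspected"
    if abs(detected_cycle - expected_cycle) <= tolerance:
        return "positive"
    if detected_cycle >= marker_cycle - tolerance:
        # Only the late marker signal appeared: target absent, PCR worked.
        return "negative"
    # Signal appeared too early or too late to be the target.
    return "potential artifact"


# Hypothetical values: target expected at cycle 20, marker at cycle 35.
results = [
    classify_reaction(d, expected_cycle=20, marker_cycle=35)
    for d in (20, 36, None, 10, 28)
]
```

For these invented inputs the rule yields, in order, a positive, a negative, a suspected inhibition and two potential artifacts. The thresholds used in practice would have to come from the kind of assessment on known samples that the text calls for.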

In summary, the inclusion in PCR of dyes whose fluorescence is
enhanced upon binding dsDNA makes it possible to detect
specific DNA amplification from outside the PCR tube. In the
future, instruments based upon this principle may facilitate
the more widespread use of PCR in applications that demand the
high throughput of samples.

EXPERIMENTAL PROTOCOL 

Human HLA-DQα gene amplifications containing EtBr. 
PCRs were set up in 100 µl volumes containing 10 mM Tris-HCl, 
pH 8.3; 50 mM KCl; 4 mM MgCl2; 33 units of Taq DNA 
polymerase (Perkin-Elmer Cetus, Norwalk, CT); 20 pmole each 
of human HLA-DQα gene specific oligonucleotide primers 
[illegible] and GH27 (19) and approximately [illegible] copies of DQα PCR 
product diluted from a previous reaction. Ethidium bromide 
(EtBr; Sigma) was used at the concentrations indicated in Figure 

Thermocycling proceeded for 20 cycles in a model 480 
thermocycler (Perkin-Elmer Cetus, Norwalk, CT) using a "step- 
cycle" program of 94°C for 1 min denaturation and 60°C for 30 
sec annealing and 72°C for 30 sec extension. 

Y-chromosome specific PCR. PCRs (100 µl total reaction 
volume) containing 0.5 µg/ml EtBr were prepared as described 
for HLA-DQα, except with different primers and target DNAs. 
These PCRs contained 15 pmole each male DNA-specific primers 
Y1.1 and Y1.2 (20), and either 60 ng male, 60 ng female, 2 ng male, 
or no human DNA. Thermocycling was 94°C for 1 min and 60°C 
for 1 min using a "step-cycle" program. The number of cycles for 
a sample were as indicated in Figure 3. Fluorescence measurement 
is described below. 

Allele-specific, human β-globin gene PCR. Amplifications of 
100 µl volume using 0.5 µg/ml of EtBr were prepared as 
described for HLA-DQα above except with different primers and 
target DNAs. These PCRs contained either primer pair HGP2/ 
Hβ14A (wild-type globin specific primers) or HGP2/Hβ14S (sickle- 
globin specific primers) at 10 pmole each primer per PCR. 
These primers were developed by Wu et al. (21). Three different 
target DNAs were used in separate amplifications: 50 ng each of 
human DNA that was homozygous for the sickle trait (SS), DNA 
that was heterozygous for the sickle trait (AS), or DNA that was 
homozygous for the w.t. globin (AA). Thermocycling was for 30 
cycles at 94°C for 1 min and 55°C for 1 min using a "step-cycle" 
program. An annealing temperature of 55°C had been shown by 
Wu et al. (21) to provide allele-specific amplification. Completed 
PCRs were photographed through a red filter (Wratten 23A) 
after placing the reaction tubes atop a model TM-36 transilluminator 
(UV-products, San Gabriel, CA). 

Fluorescence measurement. Fluorescence measurements were 
made on PCRs containing EtBr in a Fluorolog-2 fluorometer 
(SPEX, Edison, NJ). Excitation was at the 500 nm band with 
about 2 nm bandwidth with a GG 435 nm cut-off filter (Melles 
Griot, Inc., Irvine, CA) to exclude second-order light. Emitted 
light was detected at 570 nm with a bandwidth of about 7 nm. An 
OG 530 nm cut-off filter was used to remove the excitation light. 

Continuous fluorescence monitoring of PCR. Continuous 
monitoring of a PCR in progress was accomplished using the 
spectrofluorometer and settings described above as well as a 
fiberoptic accessory (SPEX cat. no. 1950) to both send excitation 
light to, and receive emitted light from, a PCR placed in a well of 
a model 480 thermocycler (Perkin-Elmer Cetus). The probe end 
of the fiberoptic cable was attached with "5 minute-epoxy" to the 
open top of a PCR tube (a 0.5 ml polypropylene centrifuge tube 
with its cap removed), effectively sealing it. The exposed top of 
the PCR tube and the end of the fiberoptic cable were shielded 
from room light and the room lights were kept dimmed during 
each run. The monitored PCR was an amplification of Y-chromosome-specific 
repeat sequences as described above, except 
using an annealing/extension temperature of 50°C. The reaction 
was covered with mineral oil (2 drops) to prevent evaporation. 
Thermocycling and fluorescence measurement were started 
simultaneously. A time-base scan with a 10 second integration time 
was used and the emission signal was ratioed to the excitation 
signal to control for changes in light-source intensity. Data were 
collected using the dm3000f, version 24 (SPEX) data system. 

We thank Bob Jones for help with the spectrofluorometric 
measurements and Heatherbell Fong for editing this manuscript. 
Oligonucleotides with Fluorescent Dyes at 
Opposite Ends Provide a Quenched Probe 
System Useful for Detecting PCR Product 
and Nucleic Acid Hybridization 




Kenneth J. Livak, Susan JA Flood, Jeffrey Marmaro, William Giusti, and Karin Deetz 

Perkin-Elmer, Applied Biosystems Division, Foster City, California 94404 



The 5' nuclease PCR assay detects the accumulation of specific 
PCR product by hybridization and cleavage of a double-labeled 
fluorogenic probe during the amplification reaction. The probe 
is an oligonucleotide with both a reporter fluorescent dye and a 
quencher dye attached. An increase in reporter fluorescence 
intensity indicates that the probe has hybridized to the target 
PCR product and has been cleaved by the 5' → 3' nucleolytic 
activity of Taq DNA polymerase. In this study, probes with the 
quencher dye attached to an internal nucleotide were compared 
with probes with the quencher dye attached to the 3'-end 
nucleotide. In all cases, the reporter dye was attached to the 
5' end. All intact probes showed quenching of the reporter 
fluorescence. In general, probes with the quencher dye attached 
to the 3'-end nucleotide exhibited a larger signal in the 5' 
nuclease PCR assay than the internally labeled probes. It is 
proposed that the larger signal is caused by increased likelihood 
of cleavage by Taq DNA polymerase when the probe is hybridized 
to a template strand during PCR. Probes with the quencher dye 
attached to the 3'-end nucleotide also exhibited an increase in 
reporter fluorescence intensity when hybridized to a 
complementary strand. Thus, oligonucleotides with reporter and 
quencher dyes attached at opposite ends can be used as 
homogeneous hybridization probes. 



A homogeneous assay for detecting the accumulation of specific 
PCR product that uses a double-labeled fluorogenic probe was 
described by Lee et al.(1) The assay exploits the 5' → 3' 
nucleolytic activity of Taq DNA polymerase(2,3) and is diagramed 
in Figure 1. The fluorogenic probe consists of an oligonucleotide 
with a reporter fluorescent dye, such as a fluorescein, attached 
to the 5' end; and a quencher dye, such as a rhodamine, attached 
internally. When the fluorescein is excited by irradiation, its 
fluorescent emission will be quenched if the rhodamine is close 
enough to be excited through the process of fluorescence energy 
transfer (FET).(4,5) During PCR, if the probe is hybridized to a 
template strand, Taq DNA polymerase will cleave the probe 
because of its inherent 5' → 3' nucleolytic activity. If the 
cleavage occurs between the fluorescein and rhodamine dyes, it 
causes an increase in fluorescein fluorescence intensity because 
the fluorescein is no longer quenched. The increase in 
fluorescein fluorescence intensity indicates that the 
probe-specific PCR product has been generated. Thus, FET between 
a reporter dye and a quencher dye is critical to the performance 
of the probe in the 5' nuclease PCR assay. 

Quenching is completely dependent 
on the physical proximity of the two 
dyes. (6) Because of this, it has been as- 
sumed that the quencher dye must be 
attached near the 5' end. Surprisingly, 
we have found that attaching a rho- 
damine dye at the 3' end of a probe 
still provides adequate quenching for 
the probe to perform in the 5' nuclease 



PCR assay. Furthermore, cleavage of this 
type of probe is not required to achieve 
some reduction in quenching. Oligonu- 
cleotides with a reporter dye on the 5' 
end and a quencher dye on the 3' end 
exhibit a much higher reporter fluores- 
cence when double-stranded as com- 
pared with single-stranded. This should 
make it possible to use this type of dou- 
ble-labeled probe for homogeneous de- 
tection of nucleic acid hybridization. 



MATERIALS AND METHODS 
Oligonucleotides 

Table 1 shows the nucleotide sequence 
of the oligonucleotides used in this 
study. Linker arm nucleotide (LAN) 
phosphoramidite was obtained from 
Glen Research. The standard DNA phos- 
phoramidites, 6-carboxyfluorescein (6- 
FAM) phosphoramidite, 6-carboxytet- 
ramethylrhodamine succinimidyl ester 
(TAMRA NHS ester), and Phosphalink 
for attaching a 3'-blocking phosphate, 
were obtained from Perkin-Elmer, Applied 
Biosystems Division. Oligonucleotide 
synthesis was performed using an 
ABI model 394 DNA synthesizer (Applied 
Biosystems). Primer and complement 
oligonucleotides were purified using 
Oligo Purification Cartridges (Applied 
Biosystems). Double-labeled probes were 
synthesized with 6-FAM-labeled phos- 
phoramidite at the 5' end, LAN replacing 
one of the T's in the sequence, and Phosphalink 
at the 3' end. Following deprotection 
and ethanol precipitation, 
TAMRA NHS ester was coupled to the 
LAN-containing oligonucleotide in 250 
FIGURE 1 Diagram of 5' nuclease assay. Stepwise representation of the 5' → 3' nucleolytic 
activity of Taq DNA polymerase acting on a fluorogenic probe during one extension phase of PCR. 



mM Na-bicarbonate buffer (pH 9.0) at 
room temperature. Unreacted dye was 
removed by passage over a PD-10 Sephadex 
column. Finally, the double-labeled 
probe was purified by preparative 
high-performance liquid chromatography 
(HPLC) using an Aquapore C8 220 × 4.6-mm 
column with 7-µm particle size. The 
column was developed with a 24-min 
linear gradient of 8-20% acetonitrile in 
0.1 M TEAA (triethylamine acetate). 
Probes are named by designating the 
sequence from Table 1 and the position of 
the LAN-TAMRA moiety. For example, 
probe A1-7 has sequence A1 with 
LAN-TAMRA at nucleotide position 7 from the 
5' end. 
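
The naming convention can be checked against the A1 sequence with a short sketch. The sequence string below is the A1 probe from Table 1 as read from this copy (an assumption, since the scan is imperfect), and the function names and the R/Q/p rendering are illustrative only.

```python
# A1 probe sequence as read from Table 1 (5' -> 3'); the 3' phosphate
# ("p") is handled separately. This reading of the scan is an assumption.
A1_SEQUENCE = "ATGCCCTCCCCCATGCCATCCTGCGT"


def candidate_probe_names(sequence_name, sequence):
    """Name one probe per T, since LAN-TAMRA replaces a T in the sequence.

    Positions are counted from the 5' end, starting at 1.
    """
    return [
        f"{sequence_name}-{position}"
        for position, base in enumerate(sequence, start=1)
        if base == "T"
    ]


def render_probe(sequence, quencher_position):
    """Render a probe with R (reporter) at the 5' end, Q at the LAN-TAMRA
    position, and p for the 3' phosphate. The notation is illustrative."""
    index = quencher_position - 1
    if sequence[index] != "T":
        raise ValueError("LAN-TAMRA can only replace a T")
    return "R-" + sequence[:index] + "Q" + sequence[index + 1:] + "-p"


print(candidate_probe_names("A1", A1_SEQUENCE))
print(render_probe(A1_SEQUENCE, 7))
```

Under this reading, the T positions in A1 are 2, 7, 14, 19, 22, and 26, which matches the six A1 probes compared in Figure 2.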



PCR Systems 

All PCR amplifications were performed 
in the Perkin-Elmer GeneAmp PCR System 
9600 using 50-µl reactions that contained 
10 mM Tris-HCl (pH 8.3), 50 mM 
KCl, 200 µM dATP, 200 µM dCTP, 200 µM 
dGTP, 400 µM dUTP, 0.5 unit of AmpErase 
uracil N-glycosylase (Perkin-Elmer), 
and 1.25 unit of AmpliTaq DNA polymerase 
(Perkin-Elmer). A 295-bp segment 
from exon 3 of the human β-actin 
gene (nucleotides 2141-2435 in the 
sequence of Nakajima-Iijima et al.)(7) was 
amplified using primers AFP and ARP 
(Table 1), which are modified slightly 
from those of du Breuil et al.(8) Actin 
amplification reactions contained 4 mM 
MgCl2, 20 ng of human genomic DNA, 
50 nM A1 or A3 probe, and 300 nM each 


TABLE 1 Sequences of Oligonucleotides 



primer. The thermal regimen was 50°C 
(2 min), 95°C (10 min), 40 cycles of 95°C 
(20 sec), 60°C (1 min), and hold at 72°C. 
A 515-bp segment was amplified from a 
plasmid that consists of a segment of λ 
DNA (nucleotides 32,220-32,747) inserted 
in the SmaI site of vector pUC119. 
These reactions contained 3.5 mM 
MgCl2, 1 ng of plasmid DNA, 50 nM P2 or 
P5 probe, 200 nM primer F119, and 200 
nM primer R119. The thermal regimen 
was 50°C (2 min), 95°C (10 min), 25 cycles 
of 95°C (20 sec), 57°C (1 min), and 
hold at 72°C. 



Fluorescence Detection 

For each amplification reaction, a 40-µl 
aliquot of a sample was transferred to an 
individual well of a white, 96-well microtiter 
plate (Perkin-Elmer). Fluorescence 
was measured on the Perkin-Elmer TaqMan 
LS-50B System, which consists of a 
luminescence spectrometer with plate 
reader assembly, a 485-nm excitation filter, 
and a 515-nm emission filter. Excitation 
was at 488 nm using a 5-nm slit 
width. Emission was measured at 518 
nm for 6-FAM (the reporter or R value) 
and 582 nm for TAMRA (the quencher or 
Q value) using a 10-nm slit width. To 
determine the increase in reporter emission 
that is caused by cleavage of the 
probe during PCR, three normalizations 
are applied to the raw emission data. 
First, emission intensity of a buffer blank 
is subtracted for each wavelength. Second, 
emission intensity of the reporter is 


Name   Type         Sequence 

F119   primer       ACCCACAGGAACTGATCACCACTC 
R119   primer       ATGTCGCGTTCCGGCTGACGTTCTGC 
P2     probe        TCGCATTACTGATCGTTGCCAACCAGTp 
P2C    complement   GTACTGGTTGGCAACGATCAGTAATGCGATG 
P5     probe        CGGATTTGCTGGTATCTATGACAAGGATp 
P5C    complement   TTCATCCTTGTCATAGATACCAGCAAATCCG 
AFP    primer       TCACCCACACTGTGCCCATCTACGA 
ARP    primer       CAGCGGAACCGCTCATTGCCAATGG 
A1     probe        ATGCCCTCCCCCATGCCATCCTGCGTp 
A1C    complement   AGACGCAGGATGGCATGGGGGAGGGCATAC 
A3     probe        CGCCCTGGACTTCGAGCAAGAGATp 
A3C    complement   CCATCTCTTGCTCGAAGTCCAGGGCGAC 

For each oligonucleotide used in this study, the nucleic acid sequence is given, written in the 
5' → 3' direction. There are three types of oligonucleotides: PCR primer, fluorogenic probe used 
in the 5' nuclease assay, and complement used to hybridize to the corresponding probe. For the 
probes, underlined base indicates a position where LAN with TAMRA attached was substituted 
for a T. (p) The presence of a 3' phosphate on each probe. 
A1-2    RAQGCCCTCCCCCATGCCATCCTGCGTp 
A1-7    RATGCCCQCCCCCATGCCATCCTGCGTp 
A1-14   RATGCCCTCCCCCAQGCCATCCTGCGTp 
A1-19   RATGCCCTCCCCCATGCCAQCCTGCGTp 
A1-22   RATGCCCTCCCCCATGCCATCCQGCGTp 
A1-26   RATGCCCTCCCCCATGCCATCCTGCGQp 

         518 nm                        582 nm 
Probe    no temp.       + temp.        no temp.      + temp.       RQ−           RQ+           ΔRQ 

A1-2     25.5 ± 2.1     32.7 ± 1.9     38.2 ± 3.0    38.2 ± 2.0    0.67 ± 0.01   0.86 ± 0.06   0.19 ± 0.06 
A1-7     53.5 ± 4.3     395.1 ± 21.4   108.5 ± 6.3   110.3 ± 5.3   0.49 ± 0.03   3.58 ± 0.17   3.09 ± 0.18 
A1-14    127.0 ± 4.9    403.5 ± 19.1   109.7 ± 5.3   93.1 ± 6.3    1.15 ± 0.02   4.34 ± 0.15   3.18 ± 0.15 
A1-19    187.5 ± 17.9   422.7 ± 7.7    70.3 ± 7.4    73.0 ± 2.8    2.67 ± 0.05   5.80 ± 0.15   3.13 ± 0.16 
A1-22    224.6 ± 9.4    482.2 ± 43.6   100.0 ± 4.0   96.2 ± 9.6    2.25 ± 0.03   5.02 ± 0.11   2.77 ± 0.12 
A1-26    160.2 ± 8.9    454.1 ± 18.4   93.1 ± 5.4    90.7 ± 3.2    1.72 ± 0.02   5.01 ± 0.08   3.29 ± 0.08 

FIGURE 2 Results of 5' nuclease assay comparing β-actin probes with TAMRA at different 
nucleotide positions. As described in Materials and Methods, PCR amplifications containing the 
indicated probes were performed, and the fluorescence emission was measured at 518 and 582 nm. 
Reported values are the average ±1 s.d. for six reactions run without added template (no temp.) 
and six reactions run with template (+ temp.). The RQ ratio was calculated for each individual 
reaction and averaged to give the reported RQ− and RQ+ values. 



divided by the emission intensity of the 
quencher to give an RQ ratio for each 
reaction tube. This normalizes for 
well-to-well variations in probe concentration 
and fluorescence measurement. Finally, 
ΔRQ is calculated by subtracting 
the RQ value of the no-template control 
(RQ−) from the RQ value for the complete 
reaction including template (RQ+). 
RESULTS 

A series of probes with increasing dis- 
tances between the fluorescein reporter 
and rhodamine quencher were tested to 
investigate the minimum and maximum 
spacing that would give an acceptable 
performance in the 5' nuclease PCR as- 
say. These probes hybridize to a target 



sequence in the human β-actin gene. Figure 2 shows the results 
of an experiment in which these probes were included in PCR 
that amplified a segment of the β-actin gene containing the 
target sequence. Performance in the 5' nuclease PCR assay is 
monitored by the magnitude of ΔRQ, which is a measure of the 
increase in reporter fluorescence caused by PCR amplification 
of the probe target. Probe A1-2 has a ΔRQ value that is close to 
zero, indicating that the probe was not cleaved appreciably 
during the amplification reaction. This suggests that with the 
quencher dye on the second nucleotide from the 5' end, there is 
insufficient room for Taq polymerase to cleave efficiently 
between the reporter and quencher. The other five probes 
exhibited comparable ΔRQ values that are clearly different from 
zero. Thus, all five probes are being cleaved during PCR 
amplification resulting in a similar increase in reporter 
fluorescence. It should be noted that complete digestion of a 
probe produces a much larger increase in reporter fluorescence 
than that observed in Figure 2 (data not shown). Thus, even in 
reactions where amplification occurs, the majority of probe 
molecules remain uncleaved. It is mainly for this reason that 
the fluorescence intensity of the quencher dye TAMRA changes 
little with amplification of the target. This is what allows us 
to use the 582-nm fluorescence reading as a normalization factor. 

The magnitude of RQ− depends mainly on the quenching efficiency 
inherent in the specific structure of the probe and the purity 
of the oligonucleotide. Thus, the larger RQ− values indicate 
that probes A1-14, A1-19, A1-22, and A1-26 probably have reduced 
quenching as compared with A1-7. Still, the degree of quenching 
is sufficient to detect a highly significant increase in 
reporter fluorescence when each of these probes is cleaved 
during PCR. 

To further investigate the ability of TAMRA on the 3' end to 
quench 6-FAM on the 5' end, three additional pairs of probes 
were tested in the 5' nuclease PCR assay. For each pair, one 
probe has TAMRA attached to an internal nucleotide and the 
other has TAMRA attached to the 3' end nucleotide. The results 
are shown in Table 2. For all three sets, the probe with the 3' 
quencher exhibits a ΔRQ value that is considerably higher than 
for the probe with the internal quencher. The RQ− values suggest 
that differences in quenching are not as great as those observed 
with some of the A1 probes. These results demonstrate that a 
quencher dye on the 3' end of an oligonucleotide 
can quench efficiently the 



TABLE 2 Results of 5' Nuclease Assay Comparing Probes with TAMRA Attached to an Internal or 3'-terminal Nucleotide 

         518 nm                       582 nm 
Probe    no temp.      + temp.        no temp.      + temp.        RQ−           RQ+           ΔRQ 

A3-6     54.6 ± 3.2    84.8 ± 3.7     116.2 ± 6.4   115.6 ± 2.5    0.47 ± 0.02   0.73 ± 0.03   0.26 ± 0.04 
A3-24    72.1 ± 2.9    236.5 ± 11.1   84.2 ± 4.0    90.2 ± 3.8     0.86 ± 0.02   2.62 ± 0.05   1.76 ± 0.05 

P2-7     82.8 ± 4.4    384.0 ± 34.1   105.1 ± 6.4   120.4 ± 10.2   0.79 ± 0.02   3.19 ± 0.16   2.40 ± 0.16 
P2-27    113.4 ± 6.6   555.4 ± 14.1   140.7 ± 8.5   118.7 ± 4.8    0.81 ± 0.01   4.68 ± 0.10   3.88 ± 0.10 

P5-10    77.5 ± 6.5    244.4 ± 15.9   86.7 ± 4.3    95.8 ± 6.7     0.89 ± 0.05   2.55 ± 0.06   1.66 ± 0.08 
P5-28    64.0 ± 5.2    333.6 ± 12.1   100.6 ± 6.1   94.7 ± 6.3     0.63 ± 0.02   3.53 ± 0.12   2.89 ± 0.13 



Reactions containing the indicated probes and calculations were performed as described in Material and Methods and in the legend to Fig. 2. 
fluorescence of a reporter dye on the 5' 
end. The degree of quenching is suffi- 
cient for this type of oligonucleotide to 
be used as a probe in the 5' nuclease PCR 
assay. 

To test the hypothesis that quenching 
by a 3' TAMRA depends on the flexibility 
of the oligonucleotide, fluorescence was 
measured for probes in the single- 
stranded and double-stranded states. Ta- 
ble 3 reports the fluorescence observed 
at 518 and 582 nm. The relative degree 
of quenching is assessed by calculating 
the RQ ratio. For probes with TAMRA 
6-10 nucleotides from the 5' end, there 
is little difference in the RQ values when 
comparing single-stranded with double- 
stranded oligonucleotides. The results 
for probes with TAMRA at the 3' end are 
much different. For these probes, hy- 
bridization to a complementary strand 
causes a dramatic increase in RQ. We 
propose that this loss of quenching is 
caused by the rigid structure of double- 
stranded DNA, which prevents the 5' 
and 3' ends from being in proximity. 
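
The size of the hybridization effect can be recomputed from the Table 3 intensities. The sketch below is illustrative: it uses four rows as read from this copy of Table 3, and it takes the ratio of the averaged intensities, which only approximates the published RQ values if those were averaged per determination (an assumption).

```python
# (518 nm ss, 518 nm ds, 582 nm ss, 582 nm ds) as read from Table 3.
TABLE_3 = {
    "A1-7": (27.75, 68.53, 61.08, 138.18),    # internal TAMRA
    "A1-26": (43.31, 509.38, 53.50, 93.86),   # 3' TAMRA
    "P2-7": (35.02, 70.13, 54.63, 121.09),    # internal TAMRA
    "P2-27": (39.89, 320.47, 65.10, 61.13),   # 3' TAMRA
}


def rq_single_and_double(row):
    """Return (RQ single-stranded, RQ double-stranded) for one probe."""
    r_ss, r_ds, q_ss, q_ds = row
    return r_ss / q_ss, r_ds / q_ds


def hybridization_fold_change(row):
    """Fold change in RQ on hybridization to the complement (ds / ss)."""
    rq_ss, rq_ds = rq_single_and_double(row)
    return rq_ds / rq_ss


for probe, row in TABLE_3.items():
    rq_ss, rq_ds = rq_single_and_double(row)
    fold = hybridization_fold_change(row)
    print(f"{probe}: RQ ss={rq_ss:.2f} ds={rq_ds:.2f} fold={fold:.1f}")
```

On these numbers the internally labeled probes change by roughly a factor of one, while the 3'-labeled probes rise several-fold, which is the contrast the paragraph above describes.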

When TAMRA is placed toward the 3' end, there is a marked Mg2+ 
effect on quenching. Figure 3 shows a plot of observed RQ 
values for the A1 series of probes as a function of Mg2+ 
concentration. With TAMRA attached near the 5' end (probe A1-2 
or A1-7), the RQ value at 0 mM Mg2+ is only slightly higher 
than RQ at 10 mM Mg2+. For probes A1-19, A1-22, and A1-26, the 
RQ values at 0 mM Mg2+ are very high, indicating a much reduced 
quenching efficiency. For each of these probes, there is a 
marked decrease in RQ at 1 mM Mg2+ followed by a gradual 
decline as the Mg2+ concentration increases to 10 mM. Probe 
A1-14 shows an intermediate RQ value at 0 mM Mg2+ with a 
gradual decline at higher Mg2+ concentrations. In a low-salt 
environment with no Mg2+ present, a single-stranded 
oligonucleotide would be expected to adopt an extended 
conformation because of electrostatic repulsion. The binding of 
Mg2+ ions acts to shield the negative charge of the phosphate 
backbone so that the oligonucleotide can adopt conformations 
where the 3' end is close to the 5' end. Therefore, the 
observed Mg2+ effects support the notion that quenching of a 5' 
reporter dye by TAMRA at or near the 3' end depends on the 
flexibility of the oligonucleotide. 

DISCUSSION 

The striking finding of this study is that 
it seems the rhodamine dye TAMRA, 
placed at any position in an oligonucle- 
otide, can quench the fluorescent emis- 
sion of a fluorescein (6-FAM) placed at 
the 5' end. This implies that a single- 
stranded, double-labeled oligonucle- 
otide must be able to adopt conforma- 
tions where the TAMRA is close to the 5' 
end. It should be noted that the decay of 
6-FAM in the excited state requires a cer- 
tain amount of time. Therefore, what 



TABLE 3 Comparison of Fluorescence Emissions of Single-stranded and 
Double-stranded Fluorogenic Probes 

         518 nm             582 nm             RQ 
Probe    ss       ds        ss       ds        ss      ds 

A1-7     27.75    68.53     61.08    138.18    0.45    0.50 
A1-26    43.31    509.38    53.50    93.86     0.81    5.43 
A3-6     16.75    62.88     39.33    165.57    0.43    0.38 
A3-24    30.05    578.64    67.72    140.25    0.45    3.21 
P2-7     35.02    70.13     54.63    121.09    0.64    0.58 
P2-27    39.89    320.47    65.10    61.13     0.61    5.25 
P5-10    27.34    144.85    61.95    165.54    0.44    0.87 
P5-28    33.65    462.29    72.39    104.61    0.46    4.43 

(ss) Single-stranded. The fluorescence emissions at 518 or 582 nm for solutions containing a final 
concentration of 50 nM indicated probe, 10 mM Tris-HCl (pH 8.3), 50 mM KCl, and 10 mM MgCl2. 
(ds) Double-stranded. The solutions contained, in addition, 100 nM A1C for probes A1-7 and 
A1-26, 100 nM A3C for probes A3-6 and A3-24, 100 nM P2C for probes P2-7 and P2-27, or 100 nM 
P5C for probes P5-10 and P5-28. Before the addition of MgCl2, 120 µl of each sample was heated 
at 95°C for 5 min. Following the addition of 80 µl of 25 mM MgCl2, each sample was allowed to 
cool to room temperature and the fluorescence emissions were measured. Reported values are 
the average of three determinations. 



matters for quenching is not the average 
distance between 6-FAM and TAMRA 
but, rather, how close TAMRA can get to 
6-FAM during the lifetime of the 6-FAM 
excited state. As long as the decay time of 
the excited state is relatively long com- 
pared with the molecular motions of the 
oligonucleotide, quenching can occur. 
Thus, we propose that TAMRA at the 3' 
end, or any other position, can quench 
6-FAM at the 5' end because TAMRA is in 
proximity to 6-FAM often enough to be 
able to accept energy transfer from an 
excited 6-FAM. 

Details of the fluorescence measure- 
ments remain puzzling. For example, Ta- 
ble 3 shows that hybridization of probes 
Al-26, A3-24, and P5-28 to their comple- 
mentary strands not only causes a large 
increase in 6-FAM fluorescence at 518 
nm but also causes a modest increase in 
TAMRA fluorescence at 582 nm. If 
TAMRA is being excited by energy trans- 
fer from quenched 6-FAM, then loss of 
quenching attributable to hybridization 
should cause a decrease in the fluores- 
cence emission of TAMRA. The fact that 
the fluorescence emission of TAMRA in- 
creases indicates that the situation is 
more complex. For example, we have an- 
ecdotal evidence that the bases of the 
oligonucleotide, especially G, quench 
the fluorescence of both 6-FAM and 
TAMRA to some degree. When double- 
stranded, base-pairing may reduce the 
ability of the bases to quench. The pri- 
mary factor causing the quenching of 
6-FAM in an intact probe is the TAMRA 
dye. Evidence for the importance of 
TAMRA is that 6-FAM fluorescence 
remains relatively unchanged when 
probes labeled only with 6-FAM are used 
in the 5' nuclease PCR assay (data not 
shown). Secondary effectors of fluores- 
cence, both before and after cleavage of 
the probe, need to be explored further. 

Regardless of the physical mecha- 
nism, the relative independence of posi- 
tion and quenching greatly simplifies 
the design of probes for the 5' nuclease 
PCR assay. There are three main factors 
that determine the performance of a 
double-labeled fluorescent probe in the 
5' nuclease PCR assay. The first factor is 
the degree of quenching observed in the 
intact probe. This is characterized by the 
value of RQ−, which is the ratio of 
reporter to quencher fluorescent emissions 
for a no template control PCR. 
Influences on the value of RQ− include 
the particular reporter and quencher 







FIGURE 3 Effect of Mg2+ concentration on RQ ratio for the A1 series of probes. The fluorescence 
emission intensity at 518 and 582 nm was measured for solutions containing 50 nM probe, 10 mM 
Tris-HCl (pH 8.3), 50 mM KCl, and varying amounts (0-10 mM) of MgCl2. The calculated RQ 
ratios (518 nm intensity divided by 582 nm intensity) are plotted vs. MgCl2 concentration (mM 
Mg). The key (upper right) shows the probes examined. 



dyes used, spacing between reporter and 
quencher dyes, nucleotide sequence 
context effects, presence of structure or 
other factors that reduce flexibility of 
the oligonucleotide, and purity of the 
probe. The second factor is the efficiency 
of hybridization, which depends on 
probe Tm, presence of secondary structure 
in probe or template, annealing 
temperature, and other reaction conditions. 
The third factor is the efficiency at 
which Taq DNA polymerase cleaves the 
bound probe between the reporter and 
quencher dyes. This cleavage is dependent 
on sequence complementarity between 
probe and template as shown by 
the observation that mismatches in the 
segment between reporter and quencher 
dyes drastically reduce the cleavage of 
probe.(1) 

The rise in RQ− values for the A1 series of probes seems to 
indicate that the degree of quenching is reduced somewhat as 
the quencher is placed toward the 3' end. The lowest apparent 
quenching is observed for probe A1-19 (see Fig. 3) rather than 
for the probe where the TAMRA is at the 3' end (A1-26). This is 
understandable, as the conformation of the 3' end position 
would be expected to be less restricted than the conformation 
of an internal position. In effect, a quencher at the 3' end is 
freer to adopt conformations close to the 5' reporter dye than 
is an internally placed quencher. For the other three sets of 
probes, the interpretation of RQ− values is less clear-cut. The 
A3 probes show the same trend as A1, with the 3' TAMRA probe 
having a larger RQ− than the internal TAMRA probe. For the P2 
pair, both probes have about the same RQ− value. For the P5 
probes, the RQ− for the 3' probe is less than for the 
internally labeled probe. Another factor that may explain some 
of the observed variation is that purity affects the RQ− value. 
Although all probes are HPLC purified, a small amount of 
contamination with unquenched reporter can have a large effect 
on RQ−. 

Although there may be a modest ef- 
fect on degree of quenching, the posi- 
tion of the quencher apparently can 
have a large effect on the efficiency of 
probe cleavage. The most drastic effect is 
observed with probe A1-2, where placement 
of the TAMRA on the second nucleotide 
reduces the efficiency of cleavage 
to almost zero. For the A3, P2, and P5 
probes, ΔRQ is much greater for the 3' 
TAMRA probes as compared with the internal 
TAMRA probes. This is explained 
most easily by assuming that probes 
with TAMRA at the 3' end are more likely 
to be cleaved between reporter and 
quencher than are probes with TAMRA 
attached internally. For the A1 probes, 
the cleavage efficiency of probe A1-7 
must already be quite high, as ΔRQ does 
not increase when the quencher is 
placed closer to the 3' end. This illus- 



trates the importance of being able to 
use probes with a quencher on the 3' 
end in the 5' nuclease PCR assay. In this 
assay, an increase in the intensity of re- 
porter fluorescence is observed only 
when the probe is cleaved between the 
reporter and quencher dyes. By placing 
the reporter and quencher dyes on the 
opposite ends of an oligonucleotide 
probe, any cleavage that occurs will be 
detected. When the quencher is attached 
to an internal nucleotide, sometimes the 
probe works well (A1-7) and other times 
not so well (A3-6). The relatively poor 
performance of probe A3-6 presumably 
means the probe is being cleaved 3' to 
the quencher rather than between the 
reporter and quencher. Therefore, the 
best chance of having a probe that reli- 
ably detects accumulation of PCR prod- 
uct in the 5' nuclease PCR assay is to use 
a probe with the reporter and quencher 
dyes on opposite ends. 

Placing the quencher dye on the 3' 
end may also provide a slight benefit in 
terms of hybridization efficiency. The 
presence of a quencher attached to an 
internal nucleotide might be expected to 
disrupt base-pairing and reduce the Tm 
of a probe. In fact, a 2°C-3°C reduction 
in Tm has been observed for two probes 
with internally attached TAMRAs.(9) This 
disruptive effect would be minimized by 
placing the quencher at the 3' end. Thus, 
probes with 3' quenchers might exhibit 
slightly higher hybridization efficiencies 
than probes with internal quenchers. 

The combination of increased cleav- 
age and hybridization efficiencies means 
that probes with 3' quenchers probably 
will be more tolerant of mismatches be- 
tween probe and target as compared 
with internally labeled probes. This 
tolerance of mismatches can be advantageous, 
as when trying to use a single 
probe to detect PCR-amplified products 
from samples of different species. Also, it 
means that cleavage of probe during PCR 
is less sensitive to alterations in an- 
nealing temperature or other reaction 
conditions. The one application where 
tolerance of mismatches may be a disadvantage 
is for allelic discrimination. Lee et al. (1) 
demonstrated that allele-specific probes 
were cleaved between reporter 
and quencher only when hybridized to a 
perfectly complementary target. This 
allowed them to distinguish the normal 
human cystic fibrosis allele from the 
ΔF508 mutant. Their probes had TAMRA 
attached to the seventh nucleotide from 
the 5' end and were designed so that any 
mismatches were between the reporter 
and quencher. Increasing the distance 
between reporter and quencher would 
lessen the disruptive effect of mis- 
matches and allow cleavage of the probe 
on the incorrect target. Thus, probes 
with a quencher attached to an internal 
nucleotide may still be useful for allelic 
discrimination. 

In this study loss of quenching upon 
hybridization was used to show that 
quenching by a 3' TAMRA is dependent 
on the flexibility of a single-stranded oli- 
gonucleotide. The increase in reporter 
fluorescence intensity, though, could 
also be used to determine whether 
hybridization has occurred or not. Thus, 
oligonucleotides with reporter and 
quencher dyes attached at opposite ends 
should also be useful as hybridization 
probes. The ability to detect hybridization 
in real time means that these probes 
could be used to measure hybridization 
kinetics. Also, this type of probe could be 
used to develop homogeneous hybridization 
assays for diagnostics or other applications. 
Bagwell et al. (10) describe just 
this type of homogeneous assay where 
hybridization of a probe causes an in- 
crease in fluorescence caused by a loss of 
quenching. However, they utilized a 
complex probe design that requires add- 
ing nucleotides to both ends of the 
probe sequence to form two imperfect 
hairpins. The results presented here 
demonstrate that the simple addition of 
a reporter dye to one end of an oligonu- 
cleotide and a quencher dye to the other 
end generates a fluorogenic probe that 
can detect hybridization or PCR amplification. 

We have developed a novel "real time" quantitative PCR method. The method measures PCR product 
accumulation through a dual-labeled fluorogenic probe (i.e., TaqMan Probe). This method provides very 
accurate and reproducible quantitation of gene copies. Unlike other quantitative PCR methods, real-time PCR 
does not require post-PCR sample handling, preventing potential PCR product carry-over contamination and 
resulting in much faster and higher throughput assays. The real-time PCR method has a very large dynamic 
range of starting target molecule determination (at least five orders of magnitude). Real-time quantitative 
PCR is extremely accurate and less labor-intensive than current quantitative PCR methods. 



Quantitative nucleic acid sequence analysis has had an important role in many 
fields of biological research. Measurement of gene expression (RNA) has been used 
extensively in monitoring biological responses to various stimuli (Tan et al. 
1994; Huang et al. 1995a,b; Prud'homme et al. 1995). Quantitative gene analysis 
(DNA) has been used to determine the genome quantity of a particular gene, as in 
the case of the human HER2 gene, which is amplified in ~30% of breast tumors 
(Slamon et al. 1987). Gene and genome quantitation (DNA and RNA) also have been 
used for analysis of human immunodeficiency virus (HIV) burden, demonstrating 
changes in the levels of virus throughout the different phases of the disease 
(Connor et al. 1993; Piatak et al. 1993b; Furtado et al. 1995). 

Many methods have been described for the quantitative analysis of nucleic acid 
sequences (both for RNA and DNA; Southern 1975; Sharp et al. 1980; Thomas 1980). 
Recently, PCR has proven to be a powerful tool for quantitative nucleic acid 
analysis. PCR and reverse transcriptase (RT)-PCR have permitted the analysis of 
minimal starting quantities of nucleic acid (as little as one cell equivalent). 
This has made possible many experiments that could not have been performed with 
traditional methods. Although PCR has provided a powerful tool, it is imperative 
that it be used properly for quantitation (Raeymaekers 1995). Many early reports 
of quantitative PCR and RT-PCR described quantitation of the PCR product but did 
not measure the initial target sequence quantity. It is essential to design 
proper controls for the quantitation of the initial target sequences (Ferre 1992; 
Clementi et al. 1993). 

Researchers have developed several methods of quantitative PCR and RT-PCR. One 
approach measures PCR product quantity in the log phase of the reaction before 
the plateau (Kellogg et al. 1990; Pang et al. 1990). This method requires that 
each sample has equal input amounts of nucleic acid and that each sample under 
analysis amplifies with identical efficiency up to the point of quantitative 
analysis. A gene sequence (contained in all samples at relatively constant 
quantities, such as β-actin) can be used for sample amplification efficiency 
normalization. Using conventional methods of PCR detection and quantitation (gel 
electrophoresis or plate capture hybridization), it is extremely laborious to 
assure that all samples are analyzed during the log phase of the reaction (for 
both the target gene and the normalization gene). Another method, quantitative 
competitive (QC)-PCR, has been developed and is used widely for PCR quantitation. 
QC-PCR relies on the inclusion of an internal control competitor in each reaction 
(Becker-Andre 1991; Piatak et al. 1993a,b). The efficiency of each reaction is 
normalized to the internal competitor. A known amount of internal competitor can 
be added to each sample. To obtain relative quantitation, the unknown target PCR 
product is compared with the known competitor PCR product. Success of a 
quantitative competitive PCR assay relies on developing an internal control that 
amplifies with the same efficiency as the target molecule. The design of the 
competitor and the validation of amplification efficiencies require a dedicated 
effort. However, because QC-PCR does not require that PCR products be analyzed 
during the log phase of the amplification, it is the easier of the two methods 
to use. 

Several detection systems are used for quantitative PCR and RT-PCR analysis: (1) 
agarose gels, (2) fluorescent labeling of PCR products and detection with 
laser-induced fluorescence using capillary electrophoresis (Fasco et al. 1995; 
Williams et al. 1996) or acrylamide gels, and (3) plate capture and sandwich 
probe hybridization (Mulder et al. 1994). Although these methods proved 
successful, each method requires post-PCR manipulations that add time to the 
analysis and may lead to laboratory contamination. The sample throughput of these 
methods is limited (with the exception of the plate capture approach), and, 
therefore, these methods are not well suited for uses demanding high sample 
throughput (i.e., screening of large numbers of biomolecules or analyzing samples 
for diagnostics or clinical trials). 

Here we report the development of a novel assay for quantitative DNA analysis. 
The assay is based on the use of the 5' nuclease assay first described by Holland 
et al. (1993). The method uses the 5' nuclease activity of Taq polymerase to 
cleave a nonextendible hybridization probe during the extension phase of PCR. The 
approach uses dual-labeled fluorogenic hybridization probes (Lee et al. 1993; 
Bassler et al. 1993; Livak et al. 1995a,b). One fluorescent dye serves as a 
reporter [FAM (i.e., 6-carboxyfluorescein)] and its emission spectrum is quenched 
by the second fluorescent dye, TAMRA (i.e., 6-carboxy-tetramethylrhodamine). The 
nuclease degradation of the hybridization probe releases the quenching of the FAM 
fluorescent emission, resulting in an increase in peak fluorescent emission at 
518 nm. The use of a sequence detector (ABI Prism) allows measurement of 
fluorescent spectra of all 96 wells of the thermal cycler continuously during the 
PCR amplification. Therefore, the reactions are monitored in real time. The 
output data are described and quantitative analysis of input target DNA sequences 
is discussed below. 

RESULTS 

PCR Product Detection in Real Time 

The goal was to develop a high-throughput, sensitive, and accurate gene 
quantitation assay for use in monitoring lipid-mediated therapeutic gene 
delivery. A plasmid encoding human factor VIII gene sequence, pF8TM (see 
Methods), was used as a model therapeutic gene. The assay uses fluorescent Taqman 
methodology and an instrument capable of measuring fluorescence in real time (ABI 
Prism 7700 Sequence Detector). The Taqman reaction requires a hybridization probe 
labeled with two different fluorescent dyes. One dye is a reporter dye (FAM), the 
other is a quenching dye (TAMRA). When the probe is intact, fluorescent energy 
transfer occurs and the reporter dye fluorescent emission is absorbed by the 
quenching dye (TAMRA). During the extension phase of the PCR cycle, the 
fluorescent hybridization probe is cleaved by the 5'-3' nucleolytic activity of 
the DNA polymerase. On cleavage of the probe, the reporter dye emission is no 
longer transferred efficiently to the quenching dye, resulting in an increase of 
the reporter dye fluorescent emission spectra. PCR primers and probes were 
designed for the human factor VIII sequence and human β-actin gene (as described 
in Methods). Optimization reactions were performed to choose the appropriate 
probe and magnesium concentrations yielding the highest intensity of reporter 
fluorescent signal without sacrificing specificity. The instrument uses a 
charge-coupled device (i.e., CCD camera) for measuring the fluorescent emission 
spectra from 500 to 650 nm. Each PCR tube was monitored sequentially for 25 msec 
with continuous monitoring throughout the amplification. Each tube was 
re-examined every 8.5 sec. Computer software was designed to examine the 
fluorescent intensity of both the reporter dye (FAM) and the quenching dye 
(TAMRA). The fluorescent intensity of the quenching dye, TAMRA, changes very 
little over the course of the PCR amplification (data not shown). Therefore, the 
intensity of TAMRA dye emission serves as an internal standard with which to 
normalize the reporter dye (FAM) emission variations. The software calculates a 
value termed ΔRn (or ΔRQ) using the following equation: ΔRn = (Rn+) - (Rn-), 
where Rn+ = emission intensity of reporter/emission intensity of quencher at any 
given time in a reaction tube, and Rn- = emission intensity of reporter/emission 
intensity of quencher measured prior to PCR amplification in that same reaction 
tube. For the purpose of quantitation, the last three data points (ΔRns) 
collected during the extension step for each PCR cycle were analyzed. The 
nucleolytic degradation of the hybridization probe occurs during the extension 
phase of PCR, and, therefore, reporter fluorescent emission increases during this 
time. The three data points were averaged for each PCR cycle and the mean value 
for each was plotted in an "amplification plot" shown in Figure 1A. The ΔRn mean 
value is plotted on the y-axis, and time, represented by cycle number, is plotted 
on the x-axis. During the early cycles of the PCR amplification, the ΔRn value 
remains at base line. When sufficient hybridization probe has been cleaved by the 
Taq polymerase nuclease activity, the intensity of reporter fluorescent emission 
increases. Most PCR amplifications reach a plateau phase of reporter fluorescent 
emission if the reaction is carried out to high cycle numbers. The amplification 
plot is examined early in the reaction, at a point that represents the log phase 
of product accumulation. This is done by assigning an arbitrary threshold that is 
based on the variability of the base-line data. In Figure 1A, the threshold was 
set at 10 standard deviations above the mean of base line emission calculated 
from cycles 1 to 15. Once the threshold is chosen, the point at which the 
amplification plot crosses the threshold is defined as CT. CT is reported as the 
cycle number at this point. As will be demonstrated, the CT value is predictive 
of the quantity of input target. 

Figure 1 PCR product detection in real time. (A) The Model 7700 software will 
construct amplification plots from the extension phase fluorescent emission data 
collected during the PCR amplification. The standard deviation is determined from 
the data points collected from the base line of the amplification plot. CT values 
are calculated by determining the point at which the fluorescence exceeds a 
threshold limit (usually 10 times the standard deviation of the base line). (B) 
Overlay of amplification plots of serially (1:2) diluted human genomic DNA 
samples amplified with β-actin primers. (C) Input DNA concentration of the 
samples plotted versus CT. 
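
The ΔRn normalization and the threshold rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the input is one averaged ΔRn value per cycle, and the linear interpolation used to report a fractional cycle is an assumption of this sketch, since the text does not say how the instrument software resolves the crossing point between cycles.

```python
from statistics import mean, stdev


def delta_rn(reporter, quencher, reporter_pre, quencher_pre):
    """Delta-Rn = Rn+ - Rn-: reporter/quencher ratio minus its pre-PCR value."""
    return reporter / quencher - reporter_pre / quencher_pre


def threshold_cycle(delta_rn_per_cycle, baseline_cycles=15, n_sd=10.0):
    """Return the (fractional) cycle at which Delta-Rn first exceeds the threshold.

    delta_rn_per_cycle[0] is cycle 1. The threshold is the baseline mean plus
    n_sd standard deviations, computed over the first `baseline_cycles` cycles.
    Returns None if the threshold is never crossed.
    """
    baseline = delta_rn_per_cycle[:baseline_cycles]
    threshold = mean(baseline) + n_sd * stdev(baseline)
    for i in range(baseline_cycles, len(delta_rn_per_cycle)):
        prev, curr = delta_rn_per_cycle[i - 1], delta_rn_per_cycle[i]
        if curr > threshold:
            # prev belongs to cycle i, curr to cycle i + 1 (1-indexed cycles).
            # Assumed here: linear interpolation between the two cycles.
            return i + (threshold - prev) / (curr - prev)
    return None


# Synthetic trace (not data from the study): noisy baseline, then doubling signal.
trace = [0.0, 0.02] * 7 + [0.0] + [0.05, 0.1, 0.2, 0.4, 0.8]
ct = threshold_cycle(trace)
```

With the synthetic trace above, the threshold falls between cycles 17 and 18, so the reported CT is a fractional value in that interval.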

CT Values Provide a Quantitative Measurement of 
Input Target Sequences 

Figure 1B shows amplification plots of 15 different PCR amplifications overlaid. 
The amplifications were performed on a 1:2 serial dilution of human genomic DNA. 
The amplified target was human β-actin. The amplification plots shift to the 
right (to higher threshold cycles) as the input target quantity is reduced. This 
is expected because reactions with fewer starting copies of the target molecule 
require greater amplification to degrade enough probe to attain the threshold 
fluorescence. An arbitrary threshold of 10 standard deviations above the base 
line was used to determine the CT values. Figure 1C represents the CT values 
plotted versus the sample dilution value. Each dilution was amplified in 
triplicate PCR amplifications and plotted as mean values with error bars 
representing one standard deviation. The CT values decrease linearly with 
increasing target quantity. Thus, CT values can be used as a quantitative 
measurement of the input target number. It should be noted that the amplification 
plot for the 15.6-ng sample shown in Figure 1B does not reflect the same 
fluorescent rate of increase exhibited by most of the other samples. The 15.6-ng 
sample also achieves endpoint plateau at a lower fluorescent value than would be 
expected based on the input DNA. This phenomenon has been observed occasionally 
with other samples (data not shown) and may be attributable to late cycle 
inhibition; this hypothesis is still under investigation. It is important to note 
that the flattened slope and early plateau do not impact significantly the 
calculated CT value, as demonstrated by the fit on the line shown in Figure 1C. 
All triplicate amplifications resulted in very similar CT values; the standard 
deviation did not exceed 0.5 for any dilution. This experiment contains a 
>100,000-fold range of input target molecules. Using CT values for quantitation 
permits a much larger assay range than directly using total fluorescent emission 
intensity for quantitation. The linear range of fluorescent intensity measurement 
of the ABI Prism 7700 Sequence Detector [...] measurements over a very large 
range of relative starting target quantities. 
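
Because CT varies linearly with the logarithm of the input quantity across a dilution series, a standard curve can be fitted and then inverted to estimate an unknown. The sketch below is a minimal illustration, not the instrument software: the function names are hypothetical and the dilution data are synthetic values chosen for the example, not measurements from Figure 1C.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of CT = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate input quantity from a CT value."""
    return 10 ** ((ct - intercept) / slope)


# Synthetic 1:2 dilution series (ng) with an assumed slope of -3.4 CT per log10.
dilutions = [100.0, 50.0, 25.0, 12.5]
cts = [25.0 - 3.4 * math.log10(q) for q in dilutions]
slope, intercept = fit_standard_curve(dilutions, cts)
estimate = quantity_from_ct(cts[2], slope, intercept)
```

Reading an unknown off the curve in this way corresponds to the "extrapolation to the x axis" step used later for the plasmid and genomic DNA standard curves.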

Sample Preparation Validation 

Several parameters influence the efficiency of PCR amplification: magnesium and 
salt concentrations, reaction conditions (i.e., time and temperature), PCR target 
size and composition, primer sequences, and sample purity. All of the above 
factors are common to a single PCR assay, except sample to sample purity. In an 
effort to validate the method of sample preparation for the factor VIII assay, 
PCR amplification reproducibility and efficiency of 10 replicate sample 
preparations were examined. After genomic DNA was prepared from the 10 replicate 
samples, the DNA was quantitated by ultraviolet spectroscopy. Amplifications were 
performed analyzing β-actin gene content in 100 and 25 ng of total genomic DNA. 
Each PCR amplification was performed in triplicate. Comparison of CT values for 
each triplicate sample shows minimal variation based on standard deviation and 
coefficient of variance (Table 1). Therefore, each of the triplicate PCR 
amplifications was highly reproducible, demonstrating that real time PCR using 
this instrumentation introduces minimal variation into the quantitative PCR 
analysis. Comparison of the mean CT values of the 10 replicate sample 
preparations also showed minimal variability, indicating that each sample 
preparation yielded similar results for β-actin gene quantity. The highest CT 
difference between any of the samples was [value illegible] and 0.71 for the 100 
and 25 ng samples, respectively. Additionally, the amplification of each sample 
exhibited an equivalent rate of fluorescent emission intensity change per amount 
of DNA target analyzed, as indicated by similar slopes derived from the sample 
dilutions (Fig. 2). Any sample containing an excess of a PCR inhibitor would 
exhibit a greater measured β-actin CT value for a given quantity of DNA. In 
addition, the inhibitor would be diluted along with the sample in the dilution 
analysis (Fig. 2), altering the expected CT value change. Each sample 
amplification yielded a similar result in the analysis, demonstrating that this 
method of sample preparation is highly reproducible with regard to sample purity. 
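
The per-sample statistics reported in Table 1 (mean, standard deviation, and coefficient of variance of triplicate CT values) can be reproduced with a short sketch. Two assumptions are made here and are not stated in the text: the sample (n - 1) standard deviation is used, and the coefficient of variance is expressed as a percentage of the mean. The function name is hypothetical; the example triplicate is the first 25-ng entry as it reads in Table 1.

```python
from statistics import mean, stdev


def triplicate_summary(ct_values):
    """Return (mean, sample SD, CV in percent) for replicate CT values."""
    m = mean(ct_values)
    sd = stdev(ct_values)          # sample SD (n - 1); assumed, not stated
    cv_percent = 100.0 * sd / m    # CV as percent of the mean; assumed
    return m, sd, cv_percent


# First 25-ng triplicate as read from Table 1.
m, sd, cv = triplicate_summary([20.48, 20.55, 20.50])
```

Under these assumptions the triplicate gives a mean of about 20.51, an SD of about 0.036, and a CV of about 0.18%, close to the tabulated 20.51, 0.03, and 0.17.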

Quantitative Analysis of a Plasmid After 

Table 1. Reproducibility of Sample Preparation Method 

                          100 ng                                        25 ng 
Sample   CT (triplicate)        mean         SD    CV     CT (triplicate)        mean   SD    CV 
no. 
1        18.24  18.23  18.33    18.27        0.06  0.32   20.48  20.55  20.5     20.51  0.03  0.17 
2        18.33  18.35  18.44    (illegible)  0.06  0.37   20.61  20.59  20.41    20.54  0.11  0.54 
3        18.3   18.3   18.42    18.34        0.07  0.36   20.54  20.6   20.49    20.54  0.06  0.26 
4        18.15  18.23  18.32    18.23        0.08  0.46   20.48  20.44  20.38    20.43  0.05  0.26 
5        18.4   18.38  18.46    18.42        0.04  0.23   20.68  20.87  20.63    20.71  0.13  0.61 
6        18.54  18.67  19       18.74        0.24  1.26   21.09  21.04  21.04    21.06  0.03  0.15 
7        18.28  18.36  18.52    18.39        0.12  0.66   20.67  20.73  20.65    20.68  0.04  0.2 
8        18.45  18.7   18.73    18.63        0.16  0.83   20.98  20.84  20.75    20.86  0.12  0.57 
9        18.18  18.34  18.26    18.29        0.1   0.55   20.46  20.54  20.48    20.51  0.07  0.32 
10       18.42  18.57  18.66    18.55        0.12  0.65   20.79  20.78  20.62    20.73  0.1   0.46 
Mean (1-10)                     18.42        0.17  0.90                          20.66  0.19  0.94 


[...] vector containing a partial cDNA for human factor VIII, pF8TM. A series of 
transfections was set up using a decreasing amount of the plasmid (40, 4, 0.5, 
and 0.1 µg). Twenty-four hours post-transfection, total DNA was purified from 
each flask of cells. β-Actin gene quantity was chosen as a value for 
normalization of genomic DNA concentration from each sample. In this experiment, 
β-actin gene content should remain constant relative to total genomic DNA. Figure 
3 shows the result of the β-actin DNA measurement (100 ng total DNA determined by 
ultraviolet spectroscopy) of each sample. Each sample was analyzed in triplicate 
and the mean β-actin CT values of the triplicates were plotted (error bars 
represent one standard deviation). The highest difference between any two sample 
means was 0.515 CT. Ten nanograms of total DNA of each sample were also examined 
for β-actin. The results again showed that very similar amounts of genomic DNA 
were present; the maximum mean β-actin CT value difference was 1.0. As Figure 3 
shows, the rate of β-actin CT change between the 100- and 10-ng samples was 
similar (slope values range between -3.56 and -3.45). This verifies again that 
the method of sample preparation yields samples of identical PCR integrity (i.e., 
no sample contained an excessive amount of a PCR inhibitor). However, these 
results indicate that each sample contained slight differences in the actual 
amount of genomic DNA analyzed. Determination of actual genomic DNA concentration 
was accomplished by plotting the mean β-actin CT value obtained for each 100-ng 
sample on a β-actin standard curve (shown in Fig. 4C). The actual genomic DNA 
concentration of each sample, a, was obtained by extrapolation to the x-axis. 

Figure 2 Sample preparation purity. The replicate samples shown in Table 1 were 
also amplified in triplicate using 25 ng of each DNA sample. The figure shows the 
input DNA concentration (100 and 25 ng) vs. CT. In the figure, the 100 and 25 ng 
points for each sample are connected by a line. [x-axis: log (ng input genomic 
DNA)] 
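
The slope values quoted above (between -3.56 and -3.45 CT per tenfold change in input) can be related to per-cycle amplification efficiency. The relation used in this sketch, E = 10^(-1/slope) - 1, is the conventional one for standard-curve slopes and is not stated in the text itself; the function name is hypothetical.

```python
import math


def efficiency_from_slope(slope):
    """Per-cycle amplification efficiency implied by a CT vs. log10(input) slope.

    Uses the conventional relation E = 10**(-1/slope) - 1, where E = 1.0
    corresponds to perfect doubling each cycle (slope of about -3.32).
    """
    return 10 ** (-1.0 / slope) - 1.0


e_low = efficiency_from_slope(-3.56)   # roughly 0.91
e_high = efficiency_from_slope(-3.45)  # roughly 0.95
```

On this reading, the narrow range of slopes corresponds to efficiencies of roughly 91% to 95% across samples, which is consistent with the conclusion that no sample carried an excess of PCR inhibitor.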

Figure 4A shows the measured (i.e., non-normalized) quantities of factor VIII 
plasmid DNA (pF8TM) from each of the four transient cell transfections. Each 
reaction contained 100 ng of total sample DNA (as determined by UV spectroscopy). 
Each sample was analyzed in triplicate PCR amplifications. As shown, pF8TM 
purified from the 293 cells decreases (mean CT values increase) with decreasing 
amounts of plasmid transfected. The mean CT values obtained for pF8TM in Figure 
4A were plotted on a standard curve comprised of serially diluted pF8TM, shown in 
Figure 4B. The quantity of pF8TM, b, found in each of the four transfections was 
determined by extrapolation to the x axis of the standard curve in Figure 4B. 
These uncorrected values, b, for pF8TM were normalized to determine the actual 
amount of pF8TM found per 100 ng of genomic DNA by using the equation: 

(b x 100 ng) / a = actual pF8TM copies per 100 ng of genomic DNA 

where a = actual genomic DNA in a sample and b = pF8TM copies from the standard 
curve. The normalized quantity of pF8TM per 100 ng of genomic DNA for each of the 
four transfections is shown in Figure 4D. These results show that the quantity of 
factor VIII plasmid associated with the 293 cells, 24 hr after transfection, 
decreases with decreasing plasmid concentration used in the transfection. The 
quantity of pF8TM associated with 293 cells, after transfection with 40 µg of 
plasmid, was 35 pg per 100 ng genomic DNA. This results in ~520 plasmid copies 
per cell. 

Figure 3 Analysis of transfected cell DNA quantity and purity. The DNA 
preparations of the four 293 cell transfections (40, 4, 0.5, and 0.1 µg of pF8TM) 
were analyzed for the β-actin gene. 100 and 10 ng (determined by ultraviolet 
spectroscopy) of each sample were amplified in triplicate. For each amount of 
pF8TM that was transfected, the β-actin CT values are plotted versus the total 
input DNA [x-axis: log (ng input DNA)] 
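
The normalization step above is a single proportion, sketched here for concreteness. The function name and the example numbers are hypothetical and are not values from the study; only the formula, (b x 100 ng) / a, comes from the text.

```python
def normalize_per_100ng(b, a_ng):
    """Normalize a measured plasmid quantity to 100 ng of genomic DNA.

    b    -- plasmid quantity read off the plasmid standard curve
    a_ng -- actual genomic DNA in the reaction (ng), from the beta-actin curve
    """
    if a_ng <= 0:
        raise ValueError("actual genomic DNA amount must be positive")
    return b * 100.0 / a_ng


# Hypothetical example: 30 units of plasmid measured in a reaction that the
# beta-actin curve shows actually contained 85 ng (not the nominal 100 ng).
normalized = normalize_per_100ng(30.0, 85.0)
```

A reaction that truly contained 100 ng of genomic DNA is left unchanged by this correction, while one that contained less is scaled up proportionally.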



DISCUSSION 

We have described a new method for quantitating gene copy numbers using real-time 
analysis of PCR amplifications. Real-time PCR is compatible with either of the 
two PCR (RT-PCR) approaches: (1) quantitative competitive, where an internal 
competitor for each target sequence is used for normalization (data not shown), 
or (2) quantitative comparative PCR using a normalization gene contained within 
the sample (i.e., β-actin) or a "housekeeping" gene for RT-PCR. If equal amounts 
of nucleic acid are analyzed for each sample and if the amplification efficiency 
before quantitative analysis is identical for each sample, the internal control 
(normalization gene or competitor) should give equal signals for all samples. 

The real-time PCR method offers several advantages over the other two methods 
currently employed (see the Introduction). First, the real-time PCR method is 
performed in a closed-tube system and requires no post-PCR manipulation of 
sample. Therefore, the potential for PCR contamination in the laboratory is 
reduced because amplified products can be analyzed and disposed of without 
opening the reaction tubes. Second, this method supports the use of a 
normalization gene (i.e., β-actin) for quantitative PCR or housekeeping genes for 
quantitative RT-PCR controls. Analysis is performed in real time during the log 
phase of product accumulation. Analysis during log phase permits many different 
genes (over a wide input target range) to be analyzed simultaneously, without 
concern of reaching reaction plateau at different cycles. This will make 
multigene analysis assays much easier to develop, because individual internal 
competitors will not be needed for each gene under analysis. Third, sample 
throughput will increase dramatically with the new method because there is no 
post-PCR processing time. Additionally, working in a 96-well format is highly 
compatible with automation technology. 

Figure 4 Quantitative analysis of pF8TM in transfected cells. (A) Amount of [...] 
plotted against the mean CT value [...] remaining [...] after transfection. (B,C) 
Standard curves of [...] DNA (B) and genomic DNA (C) were [...] amplification 
with the appropriate primers. The β-actin [...] (D) The amount of pF8TM present 
per 100 ng of genomic DNA. 

The real-time PCR method is highly reproducible. Replicate amplifications can be 
analyzed for each sample, minimizing potential error. The system allows for a 
very large assay dynamic range (approaching 1,000,000-fold starting target). 
Using a standard curve for the target of interest, relative copy number values 
can be determined for any unknown sample. Fluorescent threshold values, CT, 
correlate linearly with relative DNA copy numbers. Real time quantitative RT-PCR 
methodology (Gibson et al., this issue) has also been developed. Finally, real 
time quantitative PCR methodology can be used to develop high-throughput 
screening assays for a variety of applications [quantitative gene expression 
(RT-PCR), gene copy assays (Her2, HIV, etc.), genotyping (knockout mouse 
analysis), and immuno-PCR]. 

Real-time PCR may also be performed using intercalating dyes (Higuchi et al. 
1992) such as ethidium bromide. The fluorogenic probe method offers a major 
advantage over intercalating dyes: greater specificity (i.e., primer dimers and 
nonspecific PCR products are not detected). 

METHODS 

Generation of a Plasmid Containing a Partial 
cDNA for Human Factor VIII 

Total RNA was harvested (RNAzol B from Tel-Test, Inc., Friendswood, TX) from 
cells transfected with a factor VIII expression vector, pCIS2.8c25D (Eaton et al. 
1986; Gorman et al. 1990). A factor VIII partial cDNA sequence was generated by 
RT-PCR [GeneAmp EZ rTth RNA PCR Kit (part N808-0179, PE Applied Biosystems, 
Foster City, CA)] using the PCR primers F8for and F8rev (primer sequences are 
shown below). The amplicon was reamplified using modified F8for and F8rev primers 
(appended with BamHI and HindIII restriction site sequences at the 5' end) and 
cloned into pGEM-3Z (Promega Corp., Madison, WI). The resulting clone, pF8TM, was 
used for transient transfection of 293 cells. 



Amplification of Target DNA and Detection of 
Amplicon Factor VIII Plasmid DNA 

(pF8TM) was amplified with the primers F8for 5'-[sequence illegible]-3' and F8rev 
5'-AAACCT[...]-3'. The reaction produced a 422-bp PCR product. The forward primer 
was designed to recognize a unique sequence found in the 5' untranslated region 
of the parent pCIS2.8c25D plasmid and therefore does not recognize and amplify 
the human factor VIII gene. Primers were chosen with the assistance of the 
computer program Oligo 4.0 (National Biosciences, Inc., Plymouth, MN). The human 
β-actin gene was amplified with the primers β-actin forward primer 
5'-TCACCCACACTGTGCCCATCTACGA-3' and β-actin reverse primer 
5'-CAGCGGAACCGCTCATTGCCAATGG-3'. The reaction produced a 295-bp PCR product. 

Amplification reactions (50 µl) contained a DNA sample, 10x PCR Buffer II (5 µl), 
200 µM dATP, dCTP, dGTP, and 400 µM dUTP, 4 mM MgCl2, 1.25 Units AmpliTaq DNA 
polymerase, 0.5 unit AmpErase uracil N-glycosylase (UNG), 50 pmole of each factor 
VIII primer, and 15 pmole of each β-actin primer. The reactions also contained 
one of the following detection probes (100 nM each): F8 probe 
5'(FAM)[...]GCCTT(TAMRA)p-3' and β-actin probe 
5'(FAM)ATGCCC-X(TAMRA)-CCCCCATGCCATCp-3', where p indicates phosphorylation and X 
indicates a linker arm nucleotide. Reaction tubes were MicroAmp Optical Tubes 
(part number [illegible], Perkin Elmer) that were frosted (at Perkin Elmer) to 
prevent light from reflecting. Tube caps were similar to MicroAmp Caps but 
specially designed to prevent light scattering. All of the PCR consumables were 
supplied by PE Applied Biosystems (Foster City, CA) except the factor VIII 
primers, which were synthesized at Genentech, Inc. (South San Francisco, CA). 
Probes were designed using the Oligo 4.0 software, following guidelines suggested 
in the Model 7700 Sequence Detector instrument manual. Briefly, probe Tm should 
be at least 5°C higher than the annealing temperature used during thermal 
cycling; primers should not form stable duplexes with the probe. 

The thermal cycling conditions included 2 min at 50°C and 10 min at 95°C. Thermal 
cycling proceeded with [...] reactions were performed in the Model 7700 Sequence 
Detector (PE Applied Biosystems), which contains a GeneAmp PCR System 9600. 
Reaction conditions were programmed on a Power Macintosh 7100 (Apple Computer, 
Santa Clara, CA) linked directly to the Model 7700 Sequence Detector. Analysis of 
data was also performed on the Macintosh computer. Collection and analysis 
software was developed at PE Applied Biosystems. 

Transfection of Cells with Factor VIII Construct 

Four T175 flasks of 293 cells (ATCC CRL 1573), a human fetal kidney suspension 
cell line, were grown to [...] confluency and transfected with pF8TM. Cells were 
grown in the following media: 50% HAM'S F12 without GHT, 50% low glucose 
Dulbecco's modified Eagle medium (DMEM) without glycine with sodium bicarbonate, 
10% fetal bovine serum, 2 mM L-glutamine, and 1% penicillin-streptomycin. The 
media was changed 30 min before the transfection. pF8TM DNA amounts of 40, 4, 
0.5, and 0.1 µg were added to 1.5 ml of a solution containing 0.125 M CaCl2 and 
1x HEPES. The four mixtures were left at room temperature for 10 min and then 
added dropwise to the cells. The flasks were incubated at 37°C and 5% CO2 for 24 
hr, washed with PBS, and resuspended in PBS. The resuspended cells were divided 
into aliquots and DNA was extracted immediately using the QIAamp Blood Kit 
(Qiagen, Chatsworth, CA). DNA was eluted into 200 µl of 30 mM Tris-HCl at pH 8.0. 

ABSTRACT Wnt family members are critical to many 
developmental processes, and components of the Wnt signal- 
ing pathway have been linked to tumorigenesis in familial and 
sporadic colon carcinomas. Here we report the identification 
of two genes, WISP-1 and WISP-2, that are up-regulated in the 
mouse mammary epithelial cell line C57MG transformed by 
Wnt-1, but not by Wnt-4. Together with a third related gene, 
WISP-3, these proteins define a subfamily of the connective 
tissue growth factor family. Two distinct systems demon- 
strated WISP induction to be associated with the expression of 
Wnt-1. These included (i) C57MG cells infected with a Wnt-1 
retroviral vector or expressing Wnt-1 under the control of a 
tetracycline-repressible promoter, and (ii) Wnt-1 transgenic
mice. The WISP-1 gene was localized to human chromosome
8q24.1-8q24.3. WISP-1 genomic DNA was amplified in colon
cancer cell lines and in human colon tumors and its RNA
overexpressed (2- to >30-fold) in 84% of the tumors examined
compared with patient-matched normal mucosa. WISP-3
mapped to chromosome 6q22-6q23 and also was overexpressed
(4- to >40-fold) in 63% of the colon tumors analyzed.
In contrast, WISP-2 mapped to human chromosome 20q12-20q13
and its DNA was amplified, but RNA expression was
reduced (2- to > 30-fold) in 79% of the tumors. These results 
suggest that the WISP genes may be downstream of Wnt-1 
signaling and that aberrant levels of WISP expression in colon 
cancer may play a role in colon tumorigenesis. 



Wnt-1 is a member of an expanding family of cysteine-rich,
glycosylated signaling proteins that mediate diverse develop- 
mental processes such as the control of cell proliferation, 
adhesion, cell polarity, and the establishment of cell fates (1, 
2). Wnt-1 originally was identified as an oncogene activated by 
the insertion of mouse mammary tumor virus in virus-induced 
mammary adenocarcinomas (3, 4). Although Wnt-1 is not 
expressed in the normal mammary gland, expression of Wnt-1 
in transgenic mice causes mammary tumors (5). 

In mammalian cells, Wnt family members initiate signaling 
by binding to the seven-transmembrane spanning Frizzled 
receptors and recruiting the cytoplasmic protein Dishevelled 
(Dsh) to the cell membrane (1, 2, 6). Dsh then inhibits the 
kinase activity of the normally constitutively active glycogen
synthase kinase-3β (GSK-3β) resulting in an increase in
β-catenin levels. Stabilized β-catenin interacts with the
transcription factor TCF/Lef1, forming a complex that appears in
the nucleus and binds TCF/Lef1 target DNA elements to
activate transcription (7, 8). Other experiments suggest that
the adenomatous polyposis coli (APC) tumor suppressor gene
also plays an important role in Wnt signaling by regulating
β-catenin levels (9). APC is phosphorylated by GSK-3β, binds
to β-catenin, and facilitates its degradation. Mutations in
either APC or β-catenin have been associated with colon
carcinomas and melanomas, suggesting these mutations con- 
tribute to the development of these types of cancer, implicating 
the Wnt pathway in tumorigenesis (1). 

Although much has been learned about the Wnt signaling 
pathway over the past several years, only a few of the tran- 
scriptionally activated downstream components activated by 
Wnt have been characterized. Those that have been described 
cannot account for all of the diverse functions attributed to 
Wnt signaling. Among the candidate Wnt target genes are 
those encoding the nodal-related 3 gene, Xnr3, a member of 
the transforming growth factor (TGF)-β superfamily, and the
homeobox genes, engrailed, goosecoid, twin (Xtwn), and siamois
(2). A recent report also identifies c-myc as a target gene of the
Wnt signaling pathway (10).

To identify additional downstream genes in the Wnt signal- 
ing pathway that are relevant to the transformed cell pheno- 
type, we used a PCR-based cDNA subtraction strategy, sup- 
pression subtractive hybridization (SSH) (11), using RNA 
isolated from C57MG mouse mammary epithelial cells and 
C57MG cells stably transformed by a Wnt-1 retrovirus. Over- 
expression of Wnt-1 in this cell line is sufficient to induce a 
partially transformed phenotype, characterized by elongated 
and refractile cells that lose contact inhibition and form a
multilayered array (12, 13). We reasoned that genes differen- 
tially expressed between these two cell lines might contribute 
to the transformed phenotype. 

In this paper, we describe the cloning and characterization 
of two genes up-regulated in Wnt-1 transformed cells, WISP-1 
and WISP-2, and a third related gene, WISP-3. The WISP genes
are members of the CCN family of growth factors, which 
includes connective tissue growth factor (CTGF), Cyr61, and 
nov, a family not previously linked to Wnt signaling. 

MATERIALS AND METHODS 

SSH. SSH was performed by using the PCR-Select cDNA
Subtraction Kit (CLONTECH). Tester double-stranded
cDNA was synthesized from 2 µg of poly(A)+ RNA isolated
from the C57MG/Wnt-1 cell line and driver cDNA from 2 µg
of poly(A)+ RNA from the parent C57MG cells. The subtracted
cDNA library was subcloned into a pGEM-T vector for
further analysis.

Abbreviations: TGF, transforming growth factor; CTGF, connective
tissue growth factor; SSH, suppression subtractive hybridization;
VWC, von Willebrand factor type C module.
Data deposition: The sequences reported in this paper have been
deposited in the GenBank database (accession nos. AF100777,
AF100778, AF100779, AF100780, and AF100781).
†To whom reprint requests should be addressed, e-mail: diane@gene.com.

cDNA Library Screening. Clones encoding full-length
mouse WISP-1 were isolated by screening a λgt10 mouse
embryo cDNA library (CLONTECH) with a 70-bp probe from
the original partial clone 568 sequence corresponding to amino
acids 128-169. Clones encoding full-length human WISP-1
were isolated by screening λgt10 lung and fetal kidney cDNA
libraries with the same probe at low stringency. Clones encoding
full-length mouse and human WISP-2 were isolated by
screening a C57MG/Wnt-1 or human fetal lung cDNA library
with a probe corresponding to nucleotides 1463-1512.
Full-length cDNAs encoding WISP-3 were cloned from human
bone marrow and fetal kidney libraries.

Expression of Human WISP RNA. PCR amplification of 
first-strand cDNA was performed with human Multiple Tissue 
cDNA panels (CLONTECH) and 300 µM of each dNTP at
94°C for 1 sec, 62°C for 30 sec, 72°C for 1 min, for 22-32 cycles. 
WISP and glyceraldehyde-3-phosphate dehydrogenase primer 
sequences are available on request. 

In Situ Hybridization. ³³P-labeled sense and antisense
riboprobes were transcribed from an 897-bp PCR product
corresponding to nucleotides 601-1440 of mouse WISP-1 or a
294-bp PCR product corresponding to nucleotides 82-375 of
mouse WISP-2. All tissues were processed as described (40).

Radiation Hybrid Mapping. Genomic DNA from each 
hybrid in the Stanford G3 and Genebridge4 Radiation Hybrid 
Panels (Research Genetics, Huntsville, AL) and human and 
hamster control DNAs were PCR-amplified, and the results 
were submitted to the Stanford or Massachusetts Institute of 
Technology web servers. 

Cell Lines, Tumors, and Mucosa Specimens. Tissue speci- 
mens were obtained from the Department of Pathology (Uni- 
versity of Pittsburgh) for patients undergoing colon resection 
and from the University of Leeds, United Kingdom. Genomic 
DNA was isolated (Qiagen) from the pooled blood of 10 
normal human donors, surgical specimens, and the following 
ATCC human cell lines: SW480, COLO 320DM, HT-29, 
WiDr, and SW403 (colon adenocarcinomas), SW620 (lymph 
node metastasis, colon adenocarcinoma), HCT 116 (colon 
carcinoma), SK-CO-1 (colon adenocarcinoma, ascites), and 
HM7 (a variant of ATCC colon adenocarcinoma cell line LS 
174T). DNA concentration was determined by using Hoechst 
dye 33258 intercalation fluorimetry. Total RNA was prepared
by homogenization in 7 M GuSCN followed by centrifugation 
over CsCl cushions or prepared by using RNAzol. 

Gene Amplification and RNA Expression Analysis. Relative 
gene amplification and RNA expression of WISPs and c-myc in 
the cell lines, colorectal tumors, and normal mucosa were 
determined by quantitative PCR. Gene-specific primers and 
fluorogenic probes (sequences available on request) were 
designed and used to amplify and quantitate the genes. The 
relative gene copy number was derived by using the formula
2^(ΔCt), where ΔCt represents the difference in amplification
cycles required to detect the WISP genes in peripheral blood
lymphocyte DNA compared with colon tumor DNA or colon
tumor RNA compared with normal mucosal RNA. The
δ-method was used for calculation of the SE of the gene copy
number or RNA expression level. The WISP-specific signal was
normalized to that of the glyceraldehyde-3-phosphate
dehydrogenase housekeeping gene. All TaqMan assay reagents
were obtained from Perkin-Elmer Applied Biosystems. 
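
As a worked illustration of the 2^(ΔCt) formula, the sketch below converts threshold-cycle differences into relative levels. It assumes perfect doubling of product per cycle, and it assumes that normalization to the housekeeping gene is done by subtracting the housekeeping ΔCt from the target ΔCt; the paper does not spell out that step. All function names and Ct values are hypothetical.

```python
def relative_level(ct_reference, ct_sample):
    """Relative copy number or expression as 2**(delta Ct), where
    delta Ct = Ct(reference) - Ct(sample). A sample that crosses the
    threshold earlier than the reference gives a value above 1."""
    return 2.0 ** (ct_reference - ct_sample)


def normalized_relative_level(ct_target_ref, ct_target_sample,
                              ct_house_ref, ct_house_sample):
    """Same calculation after subtracting the housekeeping-gene
    delta Ct (an assumed normalization scheme, see lead-in)."""
    delta_target = ct_target_ref - ct_target_sample
    delta_house = ct_house_ref - ct_house_sample
    return 2.0 ** (delta_target - delta_house)


# Hypothetical Ct values, not data from the paper.
print(relative_level(27.0, 25.0))
print(normalized_relative_level(27.0, 25.0, 20.0, 19.0))
```

With these made-up numbers the target is detected two cycles earlier in the sample (a 4-fold difference), and the one-cycle shift in the housekeeping gene halves that to 2-fold after normalization.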

RESULTS 

Isolation of WISP-1 and WISP-2 by SSH. To identify
Wnt-1-inducible genes, we used the technique of SSH using the
mouse mammary epithelial cell line C57MG and C57MG cells
that stably express Wnt-1 (11). Candidate differentially
expressed cDNAs (1,384 total) were sequenced. Thirty-nine
percent of the sequences matched known genes or homo- 
logues, 32% matched expressed sequence tags, and 29% had 
no match. To confirm that the transcript was differentially 
expressed, semiquantitative reverse transcription-PCR and 
Northern analysis were performed by using mRNA from the 
C57MG and C57MG/Wnt-1 cells. 

Two of the cDNAs, WISP-1 and WISP-2, were differentially
expressed, being induced in the C57MG/Wnt-1 cell line, but
not in the parent C57MG cells or C57MG cells overexpressing
Wnt-4 (Fig. 1 A and B). Wnt-4, unlike Wnt-1, does not induce
the morphological transformation of C57MG cells and has no
effect on β-catenin levels (13, 14). Expression of WISP-1 was
up-regulated approximately 3-fold in the C57MG/Wnt-1 cell 
line and WISP-2 by approximately 5-fold by both Northern 
analysis and reverse transcription-PCR. 

An independent, but similar, system was used to examine 
WISP expression after Wnt-1 induction. C57MG cells express- 
ing the Wnt-1 gene under the control of a tetracycline- 
repressible promoter produce low amounts of Wnt-1 in the 
repressed state but show a strong induction of Wnt-1 mRNA
and protein within 24 hr after tetracycline removal (8). The 
levels of Wnt-1 and WISP RNA isolated from these cells at 
various times after tetracycline removal were assessed by 
quantitative PCR. Strong induction of Wnt-1 mRNA was seen 
as early as 10 hr after tetracycline removal. Induction of WISP 
mRNA (2- to 6-fold) was seen at 48 and 72 hr (data not shown). 
These data support our previous observations that show that 
WISP induction is correlated with Wnt-1 expression. Because 
the induction is slow, occurring after approximately 48 hr, the 
induction of WISPs may be an indirect response to Wnt-1 
signaling. 

cDNA clones of human WISP-1 were isolated and the 
sequence compared with mouse WISP-1. The cDNA sequences 
of mouse and human WISP-1 were 1,766 and 2,830 bp in length, 
respectively, and encode proteins of 367 aa, with predicted 
relative molecular masses of ≈40,000 (Mr 40 K). Both have
hydrophobic N-terminal signal sequences, 38 conserved cysteine
residues, and four potential N-linked glycosylation sites
and are 84% identical (Fig. 2A).

Full-length cDNA clones of mouse and human WISP-2 were
1,734 and 1,293 bp in length, respectively, and encode proteins
of 251 and 250 aa, respectively, with predicted relative molecular
masses of ≈27,000 (Mr 27 K) (Fig. 1B). Mouse and human
WISP-2 are 73% identical. Human WISP-2 has no potential
N-linked glycosylation sites, and mouse WISP-2 has one at
position 197. WISP-2 has 28 cysteine residues that are conserved
among the 38 cysteines found in WISP-1.

Fig. 1. WISP-1 and WISP-2 are induced by Wnt-1, but not Wnt-4,
expression in C57MG cells. Northern analysis of WISP-1 (A) and
WISP-2 (B) expression in C57MG, C57MG/Wnt-1, and C57MG/Wnt-4
cells. Poly(A)+ RNA (2 µg) was subjected to Northern blot
analysis and hybridized with a 70-bp mouse WISP-1-specific probe
(amino acids 278-300) or a 190-bp WISP-2-specific probe (nucleotides
1438-1627) in the 3' untranslated region. Blots were rehybridized with
human β-actin probe.

Fig. 2. Encoded amino acid sequence alignment of mouse and
human WISP-1 (A) and mouse and human WISP-2 (B). The potential
signal sequence, insulin-like growth factor-binding protein (IGF-BP),
VWC, thrombospondin (TSP), and C-terminal (CT) domains are
underlined.

Identification of WISP-3. To search for related proteins, we
screened expressed sequence tag (EST) databases with the
WISP-1 protein sequence and identified several ESTs as
potentially related sequences. We identified a homologous
protein that we have called WISP-3. A full-length human
WISP-3 cDNA of 1,371 bp was isolated corresponding to those
ESTs that encode a 354-aa protein with a predicted molecular
mass of 39,293. WISP-3 has two potential N-linked glycosylation
sites and 36 cysteine residues. An alignment of the three
human WISP proteins shows that WISP-1 and WISP-3 are the
most similar (42% identity), whereas WISP-2 has 37% identity
with WISP-1 and 32% identity with WISP-3 (Fig. 3A).
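
Percent identity figures like those quoted above are computed from an alignment. The sketch below shows one common convention, counting identical residues over the aligned columns in which neither sequence has a gap. The paper does not state which convention or program was used, so the denominator choice, the function name, and the toy sequences are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns with identical residues.

    Both inputs must be rows of the same alignment (equal length,
    gaps written as `gap`)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either row has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, not the WISP sequences.
print(percent_identity("GCGCCKVC", "GCQCCKVC"))
print(percent_identity("AC-GT", "ACCGA"))
```

Other conventions divide by the full alignment length or by the shorter sequence length, which gives lower values when gaps are present.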

WISPs Are Homologous to the CTGF Family of Proteins.
Human WISP-1, WISP-2, and WISP-3 are novel sequences;
however, mouse WISP-1 is the same as the recently identified
Elm1 gene. Elm1 is expressed in low, but not high, metastatic
mouse melanoma cells, and suppresses the in vivo growth and
metastatic potential of K-1735 mouse melanoma cells (15).
Human and mouse WISP-2 are homologous to the recently
described rat gene, rCop-1 (16). Significant homology (36-44%)
was seen to the CCN family of growth factors. This family
includes three members, CTGF, Cyr61, and the protooncogene
nov. CTGF is a chemotactic and mitogenic factor for
fibroblasts that is implicated in wound healing and fibrotic
disorders and is induced by TGF-β (17). Cyr61 is an extracellular
matrix signaling molecule that promotes cell adhesion,
proliferation, migration, angiogenesis, and tumor growth (18,
19). nov (nephroblastoma overexpressed) is an immediate
early gene associated with quiescence and found altered in
Wilms tumors (20). The proteins of the CCN family share
functional, but not sequence, similarity to Wnt-1. All are
secreted, cysteine-rich heparin binding glycoproteins that
associate with the cell surface and extracellular matrix.

Fig. 3. (A) Encoded amino acid sequence alignment of human
WISPs. The cysteine residues of WISP-1 and WISP-2 that are not
present in WISP-3 are indicated with a dot. (B) Schematic representation
of the WISP proteins showing the domain structure and cysteine
residues (vertical lines). The four cysteine residues in the VWC domain
that are absent in WISP-3 are indicated with a dot. (C) Expression of
WISP mRNA in human tissues. PCR was performed on human
multiple-tissue cDNA panels (CLONTECH) from the indicated adult
and fetal tissues.

WISP proteins exhibit the modular architecture of the CCN
family, characterized by four conserved cysteine-rich domains
(Fig. 3B) (21). The N-terminal domain, which includes the first
12 cysteine residues, contains a consensus sequence (GCGCCXXC)
conserved in most insulin-like growth factor (IGF)-binding
proteins (BP). This sequence is conserved in WISP-2
and WISP-3, whereas WISP-1 has a glutamine in the third
position instead of a glycine. CTGF recently has been shown
to specifically bind IGF (22) and a truncated nov protein
lacking the IGF-BP domain is oncogenic (23). The von Willebrand
factor type C module (VWC), also found in certain
collagens and mucins, covers the next 10 cysteine residues, and
is thought to participate in protein complex formation and
oligomerization (24). The VWC domain of WISP-3 differs
from all CCN family members described previously, in that it
contains only six of the 10 cysteine residues (Fig. 3 A and B).
A short variable region follows the VWC domain. The third
module, the thrombospondin (TSP) domain, is involved in
binding to sulfated glycoconjugates and contains six cysteine
residues and a conserved WSxCSxxCG motif first identified in
thrombospondin (25). The C-terminal (CT) module containing
the remaining 10 cysteines is thought to be involved in
dimerization and receptor binding (26). The CT domain is
present in all CCN family members described to date but is
absent in WISP-2 (Fig. 3 A and B). The existence of a putative
signal sequence and the absence of a transmembrane domain
suggest that WISPs are secreted proteins, an observation
supported by an analysis of their expression and secretion from
mammalian cell and baculovirus cultures (data not shown).
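
The two consensus motifs named in the domain description (GCGCCXXC and WSxCSxxCG) can be located in a protein sequence with a simple pattern search. The sketch below treats X or x as "any residue"; the helper names and the toy sequence are hypothetical, and this is not the method used in the paper.

```python
import re


def consensus_to_regex(consensus):
    """Turn a consensus such as 'GCGCCXXC' into a compiled regex,
    reading X or x as a wildcard for any single residue letter."""
    parts = []
    for ch in consensus:
        parts.append("[A-Z]" if ch in "Xx" else re.escape(ch))
    return re.compile("".join(parts))


def find_motif(sequence, consensus):
    """Return 0-based start positions of non-overlapping matches."""
    pattern = consensus_to_regex(consensus)
    return [m.start() for m in pattern.finditer(sequence.upper())]


# Toy sequence built for illustration; not a WISP protein.
toy = "MAGCGCCKVCAQWSPCSTTCG"
print(find_motif(toy, "GCGCCXXC"))
print(find_motif(toy, "WSxCSxxCG"))
```

A variant with glutamine at the third position, as described for WISP-1, would be searched with the consensus GCQCCXXC instead.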

14720 Cell Biology, Medical Sciences: Pennica et al.

Expression of WISP mRNA in Human Tissues. Tissue-specific
expression of human WISPs was characterized by PCR
analysis on adult and fetal multiple tissue cDNA panels.
WISP-1 expression was seen in the adult heart, kidney, lung,
pancreas, placenta, ovary, small intestine, and spleen (Fig. 3C).
Little or no expression was detected in the brain, liver, skeletal
muscle, colon, peripheral blood leukocytes, prostate, testis, or
thymus. WISP-2 had a more restricted tissue expression and
was detected in adult skeletal muscle, colon, ovary, and fetal
lung. Predominant expression of WISP-3 was seen in adult
kidney and testis and fetal kidney. Lower levels of WISP-3
expression were detected in placenta, ovary, prostate, and
small intestine.

In Situ Localization of WISP-1 and WISP-2. Expression of
WISP-1 and WISP-2 was assessed by in situ hybridization in
mammary tumors from Wnt-1 transgenic mice. Strong expression
of WISP-1 was observed in stromal fibroblasts lying within
the fibrovascular tumor stroma (Fig. 4 A-D). However, low-level
WISP-1 expression also was observed focally within tumor
cells (data not shown). No expression was observed in normal
breast. Like WISP-1, WISP-2 expression also was seen in the
tumor stroma in breast tumors from Wnt-1 transgenic animals
(Fig. 4 E-H). However, WISP-2 expression in the stroma was
in spindle-shaped cells adjacent to capillary vessels, whereas
the predominant cell type expressing WISP-1 was the stromal
fibroblasts.

Fig. 4. (A, C, E, and G) Representative hematoxylin/eosin-stained
images from breast tumors in Wnt-1 transgenic mice. The corresponding
dark-field images showing WISP-1 expression are shown in B and
D. The tumor is a moderately well-differentiated adenocarcinoma
showing evidence of adenoid cystic change. At low power (A and B),
expression of WISP-1 is seen in the delicate branching fibrovascular
tumor stroma (arrowhead). At higher magnification, expression is seen
in the stromal(s) fibroblasts (C and D), and tumor cells are negative.
Focal expression of WISP-1, however, was observed in tumor cells in
some areas. Images of WISP-2 expression are shown in E-H. At low
power (E and F), expression of WISP-2 is seen in cells lying within the
fibrovascular tumor stroma. At higher magnification, these cells
appeared to be adjacent to capillary vessels whereas tumor cells are
negative (G and H).

Chromosome Localization of the WISP Genes. The chromosomal
location of the human WISP genes was determined
by radiation hybrid mapping panels. WISP-1 is approximately
3.48 cR from the meiotic marker AFM259xc5 [logarithm of
odds (lod) score 16.31] on chromosome 8q24.1 to 8q24.3, in the
same region as the human locus of the novH family member
(27) and roughly 4 Mbs distal to c-myc (28). Preliminary fine
mapping indicates that WISP-1 is located near D8S1712 STS.
WISP-2 is linked to the marker SHGC-33922 (lod = 1,000) on
chromosome 20q12-20q13.1. Human WISP-3 mapped to
chromosome 6q22-6q23 and is linked to the marker AFM211ze5
(lod = 1,000). WISP-3 is approximately 18 Mbs proximal to
CTGF and 23 Mbs proximal to the human cellular oncogene
MYB (27, 29).

Amplification and Aberrant Expression of WISPs in Human
Colon Tumors. Amplification of protooncogenes is seen in
many human tumors and has etiological and prognostic
significance. For example, in a variety of tumor types, c-myc
amplification has been associated with malignant progression
and poor prognosis (30). Because WISP-1 resides in the same
general chromosomal location (8q24) as c-myc, we asked
whether it was a target of gene amplification, and, if so,
whether this amplification was independent of the c-myc locus.
Genomic DNA from human colon cancer cell lines was
assessed by quantitative PCR and Southern blot analysis (Fig.
5 A and B). Both methods detected similar degrees of WISP-1
amplification. Most cell lines showed significant (2- to 4-fold)
amplification, with the HT-29 and WiDr cell lines demonstrating
an 8-fold increase. Significantly, the pattern of amplification
observed did not correlate with that observed for c-myc,
indicating that the c-myc gene is not part of the amplicon that
involves the WISP-1 locus.

We next examined whether the WISP genes were amplified 
in a panel of 25 primary human colon adenocarcinomas. The 
relative WISP gene copy number in each colon tumor DNA 
was compared with pooled normal DNA from 10 donors by 
quantitative PCR (Fig. 6). The copy number of WISP-1 and
WISP-2 was significantly greater than one, approximately
2-fold for WISP-1 in about 60% of the tumors and 2- to 4-fold
for WISP-2 in 92% of the tumors (P < 0.001 for each). The
copy number for WISP-3 was indistinguishable from one (P =
0.166). In addition, the copy number of WISP-2 was signifi- 
cantly higher than that of WISP-1 (P < 0.001). 

Fig. 5. Amplification of WISP-1 genomic DNA in colon cancer cell
lines. (A) Amplification in cell line DNA was determined by quantitative
PCR. (B) Southern blots containing genomic DNA (10 µg)
digested with EcoRI (WISP-1) or XbaI (c-myc) were hybridized with
a 100-bp human WISP-1 probe (amino acids 186-219) or a human
c-myc probe (located at bp 1901-2000). The WISP and myc genes are
detected in normal human genomic DNA after a longer film exposure.

Fig. 6. Genomic amplification of WISP genes in human colon
tumors. The relative gene copy number of the WISP genes in 25
adenocarcinomas was assayed by quantitative PCR, by comparing
DNA from primary human tumors with pooled DNA from 10 healthy
donors. The data are means ± SEM from one experiment done in
triplicate. The experiment was repeated at least three times.

Fig. 7. WISP RNA expression in primary human colon tumors
relative to expression in normal mucosa from the same patient.
Expression of WISP mRNA in 19 adenocarcinomas was assayed by
quantitative PCR. The Dukes stage of the tumor is listed under the
sample number. The data are means ± SEM from one experiment
done in triplicate. The experiment was repeated at least twice.

The levels of WISP transcripts in RNA isolated from 19
adenocarcinomas and their matched normal mucosa were
assessed by quantitative PCR (Fig. 7). The level of WISP-1
RNA present in tumor tissue varied but was significantly
increased (2- to >25-fold) in 84% (16/19) of the human colon
tumors examined compared with normal adjacent mucosa.
Four of 19 tumors showed greater than 10-fold overexpression.
In contrast, in 79% (15/19) of the tumors examined, WISP-2
RNA expression was significantly lower in the tumor than the
mucosa. Similar to WISP-1, WISP-3 RNA was overexpressed in
63% (12/19) of the colon tumors compared with the normal
mucosa. The amount of overexpression of WISP-3 ranged from
4- to >40-fold.
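
The percentages quoted for the 19 matched tumor samples follow directly from the stated fractions. A minimal arithmetic check is sketched below; the function name is illustrative, and rounding to the nearest whole percent is assumed.

```python
def percent_of(count, total):
    """Fraction expressed as a whole-number percentage."""
    return round(100 * count / total)


# Fractions as stated in the text: tumors affected out of 19 examined.
for label, count in [("WISP-1 increased", 16),
                     ("WISP-2 decreased", 15),
                     ("WISP-3 increased", 12)]:
    print(label, percent_of(count, 19))
```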



DISCUSSION 

One approach to understanding the molecular basis of cancer 
is to identify differences in gene expression between cancer 
cells and normal cells. Strategies based on assumptions that 
steady-state mRNA levels will differ between normal and 
malignant cells have been used to clone differentially ex- 
pressed genes (31). We have used a PCR-based selection 
strategy, SSH, to identify genes selectively expressed in 
C57MG mouse mammary epithelial cells transformed by
Wnt-1.

Three of the genes isolated, WISP-1, WISP-2, and WISP-3,
are members of the CCN family of growth factors, which
includes CTGF, Cyr61, and nov, a family not previously linked
to Wnt signaling. 

Two independent experimental systems demonstrated that 
WISP induction was associated with the expression of Wnt-1. 
The first was C57MG cells infected with a Wnt-1 retroviral 
vector or C57MG cells expressing Wnt-1 under the control of 
a tetracycline-repressible promoter, and the second was in
Wnt-1 transgenic mice, where breast tissue expresses Wnt-1, 
whereas normal breast tissue does not. No WISP RNA expres- 
sion was detected in mammary tumors induced by polyoma 
virus middle T antigen (data not shown). These data suggest 
a link between Wnt-1 and WISPs in that in these two situations, 
WISP induction was correlated with Wnt-1 expression. 

It is not clear whether the WISPs are directly or indirectly 
induced by the downstream components of the Wnt-1 signaling 
pathway (i.e., β-catenin-TCF-1/Lef1). The increased levels of
WISP RNA were measured in Wnt-1-transformed cells, hours
or days after Wnt-1 transformation. Thus, WISP expression
could result from Wnt-1 signaling directly through β-catenin
transcription factor regulation or alternatively through Wnt-1 
signaling turning on a transcription factor, which in turn 
regulates WISPs. 

The WISPs define an additional subfamily of the CCN family 
of growth factors. One striking difference observed in the 
protein sequence of WISP-2 is the absence of a CT domain, 
which is present in CTGF, Cyr61, nov, WISP-1, and WISP-3.
This domain is thought to be involved in receptor binding and
dimerization. Growth factors, such as TGF-β, platelet-derived
growth factor, and nerve growth factor, which contain a cystine
knot motif exist as dimers (32). It is tempting to speculate that
WISP-1 and WISP-3 may exist as dimers, whereas WISP-2
exists as a monomer. If the CT domain is also important for
receptor binding, WISP-2 may bind its receptor through a
different region of the molecule than the other CCN family
members. No specific receptors have been identified for CTGF
or nov. A recent report has shown that integrin αvβ3 serves as
an adhesion receptor for Cyr61 (33). 

The strong expression of WISP-1 and WISP-2 in cells lying 
within the fibrovascular tumor stroma in breast tumors from 
Wnt-1 transgenic animals is consistent with previous obser- 
vations that transcripts for the related CTGF gene are pri- 
marily expressed in the fibrous stroma of mammary tumors 
(34). Epithelial cells are thought to control the proliferation of 
connective tissue stroma in mammary tumors by a cascade of 
growth factor signals similar to that controlling connective 
tissue formation during wound repair. It has been proposed
that mammary tumor cells or inflammatory cells at the tumor
interstitial interface secrete TGF-β1, which is the stimulus for
stromal proliferation (34). TGF-β1 is secreted by a large
percentage of malignant breast tumors and may be one of the 
growth factors that stimulates the production of CTGF and 
WISPs in the stroma. 

It was of interest that WISP-1 and WISP-2 expression was
observed in the stromal cells that surrounded the tumor cells
(epithelial cells) in the Wnt-1 transgenic mouse sections of
breast tissue. This finding suggests that paracrine signaling
could occur in which the stromal cells could supply WISP-1 and
WISP-2 to regulate tumor cell growth on the WISP extracellular
matrix. Stromal cell-derived factors in the extracellular
matrix have been postulated to play a role in tumor cell
migration and proliferation (35). The localization of WISP-1
and WISP-2 in the stromal cells of breast tumors supports this
paracrine model.

An analysis of WISP-1 gene amplification and expression in 
human colon tumors showed a correlation between DNA 
amplification and overexpression; whereas overexpression of 
WISP-3 RNA was seen in the absence of DNA amplification. 
In contrast, WISP-2 DNA was amplified in the colon tumors, 
but its mRNA expression was significantly reduced in the 
majority of tumors compared with the expression in normal
colonic mucosa from the same patient. The gene for human
WISP-2 was localized to chromosome 20q12-20q13, at a region
frequently amplified and associated with poor prognosis in
node negative breast cancer and many colon cancers, suggesting
the existence of one or more oncogenes at this locus
(36-38). Because the center of the 20q13 amplicon has not yet
been identified, it is possible that the apparent amplification 
observed for WISP-2 may be caused by another gene in this 
amplicon. 

A recent manuscript on rCop-1, the rat orthologue of
WISP-2, describes the loss of expression of this gene after cell 
transformation, suggesting it may be a negative regulator of 
growth in cell lines (16). Although the mechanism by which 
WISP-2 RNA expression is down-regulated during malignant 
transformation is unknown, the reduced expression of WISP-2 
in colon tumors and cell lines suggests that it may function as 
a tumor suppressor. These results show that the WISP genes 
are aberrantly expressed in colon cancer and suggest that their 
altered expression may confer selective growth advantage to 
the tumor. 

Members of the Wnt signaling pathway have been implicated
in the pathogenesis of colon cancer, breast cancer, and
melanoma, including the tumor suppressor gene adenomatous
polyposis coli and β-catenin (39). Mutations in specific regions
of either gene can cause the stabilization and accumulation of
cytoplasmic β-catenin, which presumably contributes to human
carcinogenesis through the activation of target genes such
as the WISPs. Although the mechanism by which Wnt-1
transforms cells and induces tumorigenesis is unknown, the
identification of WISPs as genes that may be regulated downstream
of Wnt-1 in C57MG cells suggests they could be
important mediators of Wnt-1 transformation. The amplification
and altered expression patterns of the WISPs in human
colon tumors may indicate an important role for these genes
in tumor development.

methods. Peptides AENK or AEQK were dissolved in water, made isotonic with
NaCl and diluted into RPMI growth medium. T-cell proliferation assays were 
done essentially as described. Briefly, after antigen pulsing (30 µg ml⁻¹ 
TTCF) with tetrapeptides (1-2 mg ml⁻¹), PBMCs or EBV-B cells were 
washed in PBS and fixed for 45 s in 0.05% glutaraldehyde. Glycine was added 
to a final concentration of 0.1 M and the cells were washed five times in RPMI 
1640 medium containing 1% FCS before co-culture with T-cell clones in 
round-bottom 96-well microtitre plates. After 48 h, the cultures were pulsed 
with 1 µCi of ³H-thymidine and harvested for scintillation counting 16 h later. 
Predigestion of native TTCF was done by incubating 200 µg TTCF with 0.25 µg 
pig kidney legumain in 500 µl 50 mM citrate buffer, pH 5.5, for 1 h at 37 °C. 
Glycopeptide digestions. The peptides HIDNEEDI, HIDN(N-glucosamine) 
EEDI and HIDNESDI, which are based on the TTCF sequence, and 
QQQHLFGSNVTDCSGNFCLFR(KKK), which is based on human transferrin, 
were obtained by custom synthesis. The three C-terminal lysine residues were 
added to the natural sequence to aid solubility. The transferrin glycopeptide 
QQQHLFGSNVTDCSGNFCLFR was prepared by tryptic (Promega) digestion 
of 5 mg reduced, carboxymethylated human transferrin followed by 
concanavalin A chromatography. Glycopeptides corresponding to residues 
622-642 and 421-452 were isolated by reverse-phase HPLC and identified by 
mass spectrometry and N-terminal sequencing. The lyophilized transferrin- 
derived peptides were redissolved in 50 mM sodium acetate, pH 5.5, 10 mM 
dithiothreitol, 20% methanol. Digestions were performed for 3 h at 30 °C with 
5-50 mU ml⁻¹ pig kidney legumain or B-cell AEP. Products were analysed by 
HPLC or MALDI-TOF mass spectrometry using a matrix of 10 mg ml⁻¹ α- 
cyanocinnamic acid in 50% acetonitrile/0.1% TFA and a PerSeptive Biosystems 
Elite STR mass spectrometer set to linear or reflector mode. Internal standar- 
dization was obtained with a matrix ion of 568.13 mass units. 
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Fas ligand (FasL) is produced by activated T cells and natural 
killer cells and it induces apoptosis (programmed cell death) in 
target cells through the death receptor Fas/Apo1/CD95 (ref. 1). 
One important role of FasL and Fas is to mediate immune- 
cytotoxic killing of cells that are potentially harmful to the 
organism, such as virus-infected or tumour cells 1 . Here we 
report the discovery of a soluble decoy receptor, termed decoy 
receptor 3 (DcR3), that binds to FasL and inhibits FasL-induced 
apoptosis. The DcR3 gene was amplified in about half of 35 
primary lung and colon tumours studied, and DcR3 messenger 
RNA was expressed in malignant tissue. Thus, certain tumours 
may escape FasL-dependent immune-cytotoxic attack by expres- 
sing a decoy receptor that blocks FasL. 

By searching expressed sequence tag (EST) databases, we identi- 
fied a set of related ESTs that showed homology to the tumour 
necrosis factor (TNF) receptor (TNFR) gene superfamily 2 . Using 
the overlapping sequence, we isolated a previously unknown full- 
length complementary DNA from human fetal lung. We named the 
protein encoded by this cDNA decoy receptor 3 (DcR3). The cDNA 
encodes a 300-amino-acid polypeptide that resembles members of 
the TNFR family (Fig. 1a): the amino terminus contains a leader 
sequence, which is followed by four tandem cysteine-rich domains 
(CRDs). Like one other TNFR homologue, osteoprotegerin (OPG) 3 , 
DcR3 lacks an apparent transmembrane sequence, which indicates 
that it may be a secreted, rather than a membrane-associated, 
molecule. We expressed a recombinant, histidine-tagged form of 
DcR3 in mammalian cells; DcR3 was secreted into the cell culture 
medium, and migrated on polyacrylamide gels as a protein of 
relative molecular mass 35,000 (data not shown). DcR3 shares 
sequence identity in particular with OPG (31%) and TNFR2 
(29%), and has relatively less homology with Fas (17%). All of 
the cysteines in the four CRDs of DcR3 and OPG are conserved; 
however, the carboxy-terminal portion of DcR3 is 101 residues 
shorter. 

We analysed expression of DcR3 mRNA in human tissues by 
northern blotting (Fig. 1b). We detected a predominant 1.2-kilobase 
transcript in fetal lung, brain, and liver, and in adult spleen, colon 
and lung. In addition, we observed relatively high DcR3 mRNA 
expression in the human colon carcinoma cell line SW480. 

To investigate potential ligand interactions of DcR3, we generated 
a recombinant, Fc-tagged DcR3 protein. We tested binding of 
DcR3-Fc to human 293 cells transfected with individual TNF- 
family ligands, which are expressed as type 2 transmembrane 
proteins (these transmembrane proteins have their N termini in 
the cytosol). DcR3-Fc showed a significant increase in binding to 
cells transfected with FasL 4 (Fig. 2a), but not to cells transfected with 
TNF 5 , Apo2L/TRAIL 6 ' 7 , Apo3L/TWEAK w , or OPGL/TRANCE/ 



NATURE | VOL 396 | 17 DECEMBER 1998 | www.nature.com 



Nature © Macmillan Publishers Ltd 1998 






letters to nature 






RANKL IM3 (data not shown). DcR3-Fc immunoprecipitated shed 
FasL from FasL- transfected 293 cells (Fig. 2b) and purified soluble 
FasL (Fig. 2c), as did the Fc-tagged ectodomain of Fas but not 
TNFR1. Gel-filtration chromatography showed that DcR3-Fc and 
soluble FasL formed a stable complex (Fig. 2d). Equilibrium 
analysis indicated that DcR3-Fc and Fas-Fc bound to soluble 
FasL with a comparable affinity (Kd = 0.8 ± 0.2 and 
1.1 ± 0.1 nM, respectively; Fig. 2e), and that DcR3-Fc could 
block nearly all of the binding of soluble FasL to Fas-Fc (Fig. 2e, 
inset). Thus, DcR3 competes with Fas for binding to FasL. 
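
As a rough illustration of what the reported dissociation constants imply, the fraction of receptor occupied at a free ligand concentration [L] under a simple one-site equilibrium model is [L]/(Kd + [L]). The sketch below assumes that model; the text does not state which binding model was fitted, and the function name and the example concentrations are hypothetical.

```python
def fraction_bound(ligand_nm: float, kd_nm: float) -> float:
    """Fractional receptor occupancy under an assumed one-site equilibrium model."""
    if ligand_nm < 0 or kd_nm <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_nm / (kd_nm + ligand_nm)


# Mean Kd values reported in the text (nM); the error terms are ignored here.
KD_DCR3_FC_NM = 0.8
KD_FAS_FC_NM = 1.1

if __name__ == "__main__":
    # Hypothetical free sFasL concentrations, chosen only for illustration.
    for conc_nm in (0.1, 1.0, 10.0):
        dcr3 = fraction_bound(conc_nm, KD_DCR3_FC_NM)
        fas = fraction_bound(conc_nm, KD_FAS_FC_NM)
        print(f"[sFasL] = {conc_nm} nM: DcR3-Fc {dcr3:.2f}, Fas-Fc {fas:.2f}")
```

With Kd values this close, the two occupancy curves nearly overlap, which is in line with the statement that the two receptors bind soluble FasL with comparable affinity.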

To determine whether binding of DcR3 inhibits FasL activity, we 
tested the effect of DcR3-Fc on apoptosis induction by soluble 
FasL in Jurkat T leukaemia cells, which express Fas (Fig. 3a). DcR3- 
Fc and Fas-Fc blocked soluble-FasL-induced apoptosis in a 
similar dose-dependent manner, with half-maximal inhibition at 
~0.1 µg ml⁻¹. Time-course analysis showed that the inhibition did 
not merely delay cell death, but rather persisted for at least 24 hours 
(Fig. 3b). We also tested the effect of DcR3-Fc on activation- 
induced cell death (AICD) of mature T lymphocytes, a FasL- 
dependent process 1 . Consistent with previous results 13 , activation 
of interleukin-2-stimulated CD4-positive T cells with anti-CD3 
antibody increased the level of apoptosis twofold, and Fas-Fc 
blocked this effect substantially (Fig. 3c); DcR3-Fc blocked the 
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Figure 1 Primary structure and expression of human DcR3. a, Alignment of the 
amino-acid sequences of DcR3 and of osteoprotegerin (OPG); the C-terminal 101 
residues of OPG are not shown. The putative signal cleavage site (arrow), the 
cysteine-rich domains (CRD 1-4), and the N-linked glycosylation site (asterisk) are 
shown. b, Expression of DcR3 mRNA. Northern hybridization analysis was done 
using the DcR3 cDNA as a probe and blots of poly(A)+ RNA (Clontech) from 
human fetal and adult tissues or cancer cell lines. PBL, peripheral blood 
lymphocyte. 



induction of apoptosis to a similar extent. Thus, DcR3 binding 
blocks apoptosis induction by FasL. 

FasL-induced apoptosis is important in elimination of virus- 
infected cells and cancer cells by natural killer cells and cytotoxic T 
lymphocytes; an alternative mechanism involves perforin and 
granzymes 1,14-16 . Peripheral blood natural killer cells triggered 
marked cell death in Jurkat T leukaemia cells (Fig. 3d); DcR3-Fc 
and Fas-Fc each reduced killing of target cells from ~65% to 
~30%, with half-maximal inhibition at ~1 µg ml⁻¹; the residual 
killing was probably mediated by the perforin/granzyme pathway. 
Thus, DcR3 binding blocks FasL-dependent natural killer cell 
activity. Higher DcR3-Fc and Fas-Fc concentrations were required 
to block natural killer cell activity compared with those required to 
block soluble FasL activity, which is consistent with the greater 
potency of membrane-associated FasL compared with soluble 
FasL 17 . 

Given the role of immune-cytotoxic cells in elimination of 
tumour cells and the fact that DcR3 can act as an inhibitor of 
FasL, we proposed that DcR3 expression might contribute to the 
ability of some tumours to escape immune-cytotoxic attack. As 
genomic amplification frequently contributes to tumorigenesis, we 
investigated whether the DcR3 gene is amplified in cancer. We 
analysed DcR3 gene-copy number by quantitative polymerase chain 
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Figure 2 Interaction of DcR3 with FasL a, 293 cells were transfected with pRK5 
vector (top) or with pRK5 encoding full-length FasL (bottom), incubated with 
DcR3-Fc (solid line, shaded area), TNFR1 -Fc (dotted line) or buffer control 
(dashed line) (the dashed and dotted lines overlap), and analysed for binding by 
FACS. Statistical analysis showed a significant difference (P < 0.001) between the 
binding of DcR3-Fc to cells transfected with FasL or pRK5. PE, phycoerythrin- 
labelled cells. b, 293 cells were transfected as in a and metabolically labelled, and 
cell supernatants were immunoprecipitated with Fc-tagged TNFR1, DcR3 or Fas. 
c, Purified soluble FasL (sFasL) was immunoprecipitated with TNFR1-Fc, DcR3- 
Fc or Fas-Fc and visualized by immunoblot with anti-FasL antibody. sFasL was 
loaded directly for comparison in the right-hand lane. d, Flag-tagged sFasL was 
incubated with DcR3-Fc or with buffer and resolved by gel filtration; column 
fractions were analysed in an assay that detects complexes containing DcR3-Fc 
and sFasL-Flag. e, Equilibrium binding of DcR3-Fc or Fas-Fc to sFasL-Flag. 
Inset, competition of DcR3-Fc with Fas-Fc for binding to sFasL-Flag. 
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reaction (PCR) lB in genomic DNA from 35 primary lung and colon 
tumours, relative to pooled genomic DNA from peripheral blood 
leukocytes (PBLs) of 10 healthy donors. Eight of 18 lung tumours 
and 9 of 17 colon tumours showed DcR3 gene amplification, 
ranging from 2- to 18-fold (Fig. 4a, b). To confirm this result, we 
analysed the colon tumour DNAs with three more, independent sets 
of DcR3-based PCR primers and probes; we observed nearly the 
same amplification (data not shown). 

We then analysed DcR3 mRNA expression in primary tumour 
tissue sections by in situ hybridization. We detected DcR3 expres- 
sion in 6 out of 15 lung tumours, 2 out of 2 colon tumours, 2 out of 5 
breast tumours, and 1 out of 1 gastric tumour (data not shown). A 
section through a squamous-cell carcinoma of the lung is shown in 
Fig. 4c. DcR3 mRNA was localized to infiltrating malignant epithe- 
lium, but was essentially absent from adjacent stroma, indicating 
tumour-specific expression. Although the individual tumour speci- 
mens that we analysed for mRNA expression and gene amplification 
were different, the in situ hybridization results are consistent with 
the finding that the DcR3 gene is amplified frequently in tumours. 
SW480 colon carcinoma cells, which showed abundant DcR3 
mRNA expression (Fig. 1b), also had marked DcR3 gene amplifica- 
tion, as shown by quantitative PCR (fourfold) and by Southern blot 
hybridization (fivefold) (data not shown). 

If DcR3 amplification in cancer is functionally relevant, then 
DcR3 should be amplified more than neighbouring genomic 
regions that are not important for tumour survival. To test this, 



we mapped the human DcR3 gene by radiation-hybrid analysis; 
DcR3 showed linkage to marker AFM218xe7 (T160), which maps to 
chromosome position 20q13. Next, we isolated from a bacterial 
artificial chromosome (BAC) library a human genomic clone that 
carries DcR3, and sequenced the ends of the clone's insert. We then 
determined, from the nine colon tumours that showed twofold or 
greater amplification of DcR3, the copy number of the DcR3- 
flanking sequences (reverse and forward) from the BAC, and of 
seven genomic markers that span chromosome 20 (Fig. 4d). The 
DcR3-linked reverse marker showed an average amplification of 
roughly threefold, slightly less than the approximately fourfold 
amplification of DcR3; the other markers showed little or no 
amplification. These data indicate that DcR3 may be at the 'epi- 
centre' of a distal chromosome 20 region that is amplified in colon 
cancer, consistent with the possibility that DcR3 amplification 
promotes tumour survival. 

Our results show that DcR3 binds specifically to FasL and inhibits 
FasL activity. We did not detect DcR3 binding to several other TNF- 
ligand-family members; however, this does not rule out the possi- 
bility that DcR3 interacts with other ligands, as do some other 
TNFR family members, including OPG 2,19 . 

FasL is important in regulating the immune response; however, 
little is known about how FasL function is controlled. One mechan- 
ism involves the molecule cFLIP, which modulates apoptosis signal- 
ling downstream of Fas 20 . A second mechanism involves proteolytic 
shedding of FasL from the cell surface 17 . DcR3 competes with Fas for 
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Figure 3 Inhibition of FasL activity by DcR3. a, Human Jurkat T leukaemia cells 
were incubated with Flag-tagged soluble FasL (sFasL, 5 ng ml⁻¹) oligomerized 
with anti-Flag antibody (0.1 µg ml⁻¹) in the presence of the proposed inhibitors 
DcR3-Fc, Fas-Fc or human IgG1 and assayed for apoptosis (mean ± s.e.m. of 
triplicates). b, Jurkat cells were incubated with sFasL-Flag plus anti-Flag antibody 
as in a, in presence of 1 µg ml⁻¹ DcR3-Fc (filled circles), Fas-Fc (open circles) or 
human IgG1 (triangles), and apoptosis was determined at the indicated time 
points. c, Peripheral blood T cells were stimulated with PHA and interleukin-2, 
followed by control (white bars) or anti-CD3 antibody (filled bars), together with 
phosphate-buffered saline (PBS), human IgG1, Fas-Fc or DcR3-Fc (10 µg ml⁻¹). 
After 16 h, apoptosis of CD4+ cells was determined (mean ± s.e.m. of results from 
five donors). d, Peripheral blood natural killer cells were incubated with ⁵¹Cr- 
labelled Jurkat cells in the presence of DcR3-Fc (filled circles), Fas-Fc (open 
circles) or human IgG1 (triangles), and target-cell death was determined by 
release of ⁵¹Cr (mean ± s.d. for two donors, each in triplicate). 



Figure 4 Genomic amplification of DcR3 in tumours. a, Lung cancers, comprising 
eight adenocarcinomas (c, d, f, g, h, j, k, r), seven squamous-cell carcinomas (a, e, 
m, n, o, p, q), one non-small-cell carcinoma (b), one small-cell carcinoma (i), and 
one bronchial adenocarcinoma (l). The data are means ± s.d. of 2 experiments 
done in duplicate. b, Colon tumours, comprising 17 adenocarcinomas. Data are 
means ± s.e.m. of five experiments done in duplicate. c, In situ hybridization 
analysis of DcR3 mRNA expression in a squamous-cell carcinoma of the lung. A 
representative bright-field image (left) and the corresponding dark-field image 
(right) show DcR3 mRNA over infiltrating malignant epithelium (arrowheads). 
Adjacent non-malignant stroma (S), blood vessel (V) and necrotic tumour tissue 
(N) are also shown. d, Average amplification of DcR3 compared with amplifica- 
tion of neighbouring genomic regions (reverse and forward, Rev and Fwd), the 
DcR3-linked marker T160, and other chromosome-20 markers, in the nine colon 
tumours showing DcR3 amplification of twofold or more (b). Data are from two 
experiments done in duplicate. Asterisk indicates P < 0.01 for a Student's t-test 
comparing each marker with DcR3. 
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FasL binding; hence, it may represent a third mechanism of 
extracellular regulation of FasL activity. A decoy receptor that 
modulates the function of the cytokine interleukin-1 has been 
described 21 . In addition, two decoy receptors that belong to the 
TNFR family, DcR1 and DcR2, regulate the FasL-related apoptosis- 
inducing molecule Apo2L 22 . Unlike DcR1 and DcR2, which are 
membrane-associated proteins, DcR3 is directly secreted into the 
extracellular space. One other secreted TNFR-family member is 
OPG 3 , which shares greater sequence homology with DcR3 (31%) 
than do DcR1 (17%) or DcR2 (19%); OPG functions as a third 
decoy for Apo2L 19 . Thus, DcR3 and OPG define a new subset of 
TNFR-family members that function as secreted decoys to mod- 
ulate ligands that induce apoptosis. Pox viruses produce soluble 
TNFR homologues that neutralize specific TNF-family ligands, 
thereby modulating the antiviral immune response 2 . Our results 
indicate that a similar mechanism, namely, production of a soluble 
decoy receptor for FasL, may contribute to immune evasion by 
certain tumours. 

Methods 

Isolation of DcR3 cDNA. Several overlapping ESTs in GenBank (accession 
numbers AA025672, AA025673 and W67560) and in Lifeseq™ (Incyte 
Pharmaceuticals; accession numbers 1339238, 1533571, 1533650, 1542861, 
1789372 and 2207027) showed similarity to members of the TNFR family. We 
screened human cDNA libraries by PCR with primers based on the region of 
EST consensus; fetal lung was positive for a product of the expected size. By 
hybridization to a PCR-generated probe based on the ESTs, one positive clone 
(DNA30942) was identified. When searching for potential alternatively spliced 
forms of DcR3 that might encode a transmembrane protein, we isolated 50 
more clones; the coding regions of these clones were identical in size to that of 
the initial clone (data not shown). 

Fc-fusion proteins (immunoadhesins). The entire DcR3 sequence, or the 
ectodomain of Fas or TNFR1, was fused to the hinge and Fc region of human 
IgG1, expressed in insect SF9 cells or in human 293 cells, and purified as 
described. 

Fluorescence-activated cell sorting (FACS) analysis. We transfected 293 
cells using calcium phosphate or Effectene (Qiagen) with pRK5 vector or pRK5 
encoding full-length human FasL 4 (2 µg), together with pRK5 encoding CrmA 
(2 µg) to prevent cell death. After 16 h, the cells were incubated with 
biotinylated DcR3-Fc or TNFR1-Fc and then with phycoerythrin-conjugated 
streptavidin (GibcoBRL), and were assayed by FACS. The data were analysed by 
Kolmogorov-Smirnov statistical analysis. There was some detectable staining 
of vector-transfected cells by DcR3-Fc; as these cells express little FasL (data 
not shown), it is possible that DcR3 recognized some other factor that is 
expressed constitutively on 293 cells. 

Immunoprecipitation. Human 293 cells were transfected as above, and 
metabolically labelled with [³⁵S]cysteine and [³⁵S]methionine (0.5 mCi; 
Amersham). After 16 h of culture in the presence of z-VAD-fmk (10 µM), 
the medium was immunoprecipitated with DcR3-Fc, Fas-Fc or TNFR1-Fc 
(5 µg), followed by protein A-Sepharose (Repligen). The precipitates were 
resolved by SDS-PAGE and visualized on a phosphorimager (Fuji BAS2000). 
Alternatively, purified, Flag-tagged soluble FasL (1 µg) (Alexis) was incubated 
with each Fc-fusion protein (1 µg), precipitated with protein A-Sepharose, 
resolved by SDS-PAGE and visualized by immunoblotting with rabbit anti- 
FasL antibody (Oncogene Research). 

Analysis of complex formation. Flag-tagged soluble FasL (25 µg) was 
incubated with buffer or with DcR3-Fc (40 µg) for 1.5 h at 24 °C. The reaction 
was loaded onto a Superdex 200 HR 10/30 column (Pharmacia) and developed 
with PBS; 0.6-ml fractions were collected. The presence of DcR3-Fc-FasL 
complex in each fraction was analysed by placing 100 µl aliquots into microtitre 
wells precoated with anti-human IgG (Boehringer) to capture DcR3-Fc, 
followed by detection with biotinylated anti-Flag antibody Bio M2 (Kodak) and 
streptavidin-horseradish peroxidase (Amersham). Calibration of the column 
indicated an apparent relative molecular mass of the complex of 420K (data not 
shown), which is consistent with a stoichiometry of two DcR3-Fc homodimers 
to two soluble FasL homotrimers. 

Equilibrium binding analysis. Microtitre wells were coated with anti-human 



IgG, blocked with 2% BSA in PBS. DcR3-Fc or Fas-Fc was added, followed by 
serially diluted Flag-tagged soluble FasL. Bound ligand was detected with anti- 
Flag antibody as above. In the competition assay, Fas-Fc was immobilized as 
above, and the wells were blocked with excess IgG1 before addition of Flag- 
tagged soluble FasL plus DcR3-Fc. 

T-cell AICD. CD3+ lymphocytes were isolated from peripheral blood of 
individual donors using anti-CD3 magnetic beads (Miltenyi Biotech), 
stimulated with phytohaemagglutinin (PHA; 2 µg ml⁻¹) for 24 h, and cultured 
in the presence of interleukin-2 (100 U ml⁻¹) for 5 days. The cells were plated in 
wells coated with anti-CD3 antibody (Pharmingen) and analysed for apoptosis 
16 h later by FACS analysis of annexin-V-binding of CD4+ cells 24 . 
Natural killer cell activity. Natural killer cells were isolated from peripheral 
blood of individual donors using anti-CD56 magnetic beads (Miltenyi 
Biotech), and incubated for 16 h with ⁵¹Cr-loaded Jurkat cells at an effector- 
to-target ratio of 1:1 in the presence of DcR3-Fc, Fas-Fc or human IgG1. 
Target-cell death was determined by release of ⁵¹Cr in effector-target co- 
cultures relative to release of ⁵¹Cr by detergent lysis of equal numbers of Jurkat 
cells. 
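
The readout described above can be sketched as a simple ratio of counts. The function below is a minimal illustration, not the authors' analysis code: the function name, the counts-per-minute inputs and the optional spontaneous-release correction are assumptions (the text mentions only release relative to detergent lysis, so the correction defaults to zero).

```python
def percent_release(experimental_cpm: float,
                    maximal_cpm: float,
                    spontaneous_cpm: float = 0.0) -> float:
    """Percent 51Cr release relative to detergent (maximal) lysis.

    spontaneous_cpm is an optional background term that is NOT described in
    the source text; leaving it at 0 gives the plain ratio stated there.
    """
    denominator = maximal_cpm - spontaneous_cpm
    if denominator <= 0:
        raise ValueError("maximal release must exceed spontaneous release")
    return 100.0 * (experimental_cpm - spontaneous_cpm) / denominator


if __name__ == "__main__":
    # Hypothetical counts, chosen to mirror the ~65% and ~30% killing in the text.
    print(percent_release(650.0, 1000.0))  # co-culture with control IgG1
    print(percent_release(300.0, 1000.0))  # co-culture with DcR3-Fc or Fas-Fc
```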

Gene-amplification analysis. Surgical specimens were provided by J. Kern 
(lung tumours) and P. Quirke (colon tumours). Genomic DNA was extracted 
(Qiagen) and the concentration was determined using Hoechst dye 33258 
intercalation fluorometry. Amplification was determined by quantitative PCR 
using a TaqMan instrument (ABI). The method was validated by comparison of 
PCR and Southern hybridization data for the Myc and HER-2 oncogenes (data 
not shown). Gene-specific primers and fluorogenic probes were designed on 
the basis of the sequence of DcR3 or of nearby regions identified on a BAC 
carrying the human DcR3 gene; alternatively, primers and probes were based 
on Stanford Human Genome Center marker AFM218xe7 (T160), which is 
linked to DcR3 (likelihood score = 5.4), SHGC-36268 (T159), the nearest 
available marker which maps to ~500 kilobases from T160, and five extra 
markers that span chromosome 20. The DcR3-specific primer sequences were 
5'-CTTCTTCGCGCACGCTG-3' and 5'-ATCACGCCGGCACCAG-3' and the 
fluorogenic probe sequence was 
5'-(FAM)-ACACGATGCGTGCTCCAAGCAGAAp-(TAMRA), where FAM is 5'-fluorescein 
phosphoramidite. Relative gene-copy numbers were derived using the formula 
$2^{\Delta C_T}$, where $\Delta C_T$ is the 
difference in amplification cycles required to detect DcR3 in peripheral blood 
lymphocyte DNA compared to test DNA. 
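
The copy-number formula can be illustrated with a short calculation. This is a minimal sketch under two stated assumptions that go beyond the text: the product doubles exactly once per cycle, and equal amounts of reference and test DNA are loaded. The function name and the threshold-cycle values are hypothetical.

```python
def relative_copy_number(ct_reference: float, ct_test: float) -> float:
    """Relative gene-copy number as 2 ** delta_ct.

    delta_ct is the number of cycles needed to detect the target in reference
    (peripheral blood lymphocyte) DNA minus the number needed in test DNA.
    Assumes perfect doubling per cycle (an idealization).
    """
    delta_ct = ct_reference - ct_test
    return 2.0 ** delta_ct


if __name__ == "__main__":
    # Hypothetical threshold cycles: the tumour sample is detected 2 cycles
    # earlier than the reference, which corresponds to fourfold amplification.
    print(relative_copy_number(ct_reference=27.0, ct_test=25.0))  # 4.0
```

A sample detected at the same cycle as the reference gives a relative copy number of 1, that is, no amplification.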
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ABC transporters (also known as traffic ATPases) form a large 
family of proteins responsible for the translocation of a variety 
of compounds across membranes of both prokaryotes and 
eukaryotes 1 . The recently completed Escherichia coli genome 
sequence revealed that the largest family of paralogous E. coli 
proteins is composed of ABC transporters 2 . Many eukaryotic 
proteins of medical significance belong to this family, such as 
the cystic fibrosis transmembrane conductance regulator (CFTR), 
the P-glycoprotein (or multidrug-resistance protein) and the 
heterodimeric transporter associated with antigen processing 
(Tap1-Tap2). Here we report the crystal structure at 1.5 Å resolu- 
tion of HisP, the ATP-binding subunit of the histidine permease, 
which is an ABC transporter from Salmonella typhimurium. We 
correlate the details of this structure with the biochemical, genetic 
and biophysical properties of the wild-type and several mutant 
HisP proteins. The structure provides a basis for understanding 
properties of ABC transporters and of defective CFTR proteins. 

ABC transporters contain four structural domains: two nucleo- 
tide-binding domains (NBDs), which are highly conserved 
throughout the family, and two transmembrane domains 1 . In 
prokaryotes these domains are often separate subunits which are 
assembled into a membrane-bound complex; in eukaryotes the 
domains are generally fused into a single polypeptide chain. The 
periplasmic histidine permease of S. typhimurium and E. coli {J ~* is a 
well-characterized ABC transporter that is a good model for this 
superfamily. It consists of a membrane-bound complex, HisQMP2, 
which comprises integral membrane subunits, HisQ and HisM, and 
two copies of HisP, the ATP-binding subunit. HisP, which has 
properties intermediate between those of integral and peripheral 
membrane proteins 9 , is accessible from both sides of the membrane, 
presumably by its interaction with HisQ and HisM 6 . The two HisP 
subunits form a dimer, as shown by their cooperativity in ATP 
hydrolysis 5 , the requirement for both subunits to be present for 
activity*, and the formation of a HisP dimer upon chemical cross- 
linking. Soluble HisP also forms a dimer 3 . HisP has been purified 
and characterized in an active, soluble form 3 which can be recon- 
stituted into a fully active membrane-bound complex*. 

The overall shape of the crystal structure of the HisP monomer is 
that of an 'L' with two thick arms (arm I and arm II); the ATP- 
binding pocket is near the end of arm I (Fig. 1). A six-stranded β- 
sheet (β3 and β8-β12) spans both arms of the L, with a domain of an 
α- plus β-type structure (β1, β2, β4-β7, α1 and α2) on one side 
(within arm I) and a domain of mostly α-helices (α3-α9) on the 
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figure 1 Crystal structure of HisP. a. View of the dimer along an axis 
perpendicular to its two-fold axis. The top and bottom of the dimer are suggested 
to face towards the periplasmic and cytoplasmic sides, respectively (see text). 
The thickness of arm II is about 25 Å, comparable to that of membrane. α-Helices 
are shown in orange and β-sheets in green. b, View along the two-fold axis of the 
HisP dimer, showing the relative displacement of the monomers not apparent in 
a. The β-strands at the dimer interface are labelled. c, View of one monomer from 
the bottom of arm I, as shown in a, towards arm II, showing the ATP-binding 
pocket. a-c, The protein and the bound ATP are in 'ribbon' and 'ball-and-stick' 
representations, respectively. Key residues discussed in the text are indicated in 
c. These figures were prepared with MOLSCRIPT. N, amino terminus; C, C 
terminus. 
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Gene amplification is a common event in the progression of 
human cancers, and amplified oncogenes have been shown to 
have diagnostic, prognostic and therapeutic relevance. A 
kinetic quantitative polymerase-chain-reaction (PCR) method, 
based on fluorescent TaqMan methodology and a new instru- 
ment (ABI Prism 7700 Sequence Detection System) capable 
of measuring fluorescence in real-time, was used to quantify 
gene amplification in tumor DNA. Reactions are character- 
ized by the point during cycling when PCR amplification is still 
in the exponential phase, rather than the amount of PCR 
product accumulated after a fixed number of cycles. None of 
the reaction components is limited during the exponential 
phase, meaning that values are highly reproducible in reac- 
tions starting with the same copy number. This greatly 
improves the precision of DNA quantification. Moreover, 
real-time PCR does not require post-PCR sample handling, 
thereby preventing potential PCR-product carry-over con- 
tamination; it possesses a wide dynamic range of quantifica- 
tion and results in much faster and higher sample throughput. 
The real-time PCR method was used to develop and validate 
a simple and rapid assay for the detection and quantification 
of the 3 most frequently amplified genes (myc, ccnd1 and 
erbB2) in breast tumors. Extra copies of myc, ccnd1 and erbB2 
were observed in 10, 23 and 15%, respectively, of 108 breast- 
tumor DNA; the largest observed numbers of gene copies 
were 4.6, 18.6 and 15.1, respectively. These results correlated 
well with those of Southern blotting. The use of this new 
semi-automated technique will make molecular analysis of 
human cancers simpler and more reliable, and should find 
broad applications in clinical and research settings. Int. J. 
Cancer 78:661-666, 1998. 
© 1998 Wiley-Liss, Inc. 

Gene amplification plays an important role in the pathogenesis 
of various solid tumors, including breast cancer, probably because 
over-expression of the amplified target genes confers a selective 
advantage. The first technique used to detect genomic amplification 
was cytogenetic analysis. Amplification of several chromosome 
regions, visualized either as extrachromosomal double minutes 
(dmins) or as integrated homogeneously staining regions (HSRs), 
are among the main visible cytogenetic abnormalities in breast 
tumors. Other techniques such as comparative genomic hybridiza- 
tion (CGH) (Kallioniemi et al., 1994) have also been used in broad 
searches for regions of increased DNA copy numbers in tumor 
cells, and have revealed some 20 amplified chromosome regions in 
breast tumors. Positional cloning efforts are underway to identify 
the critical gene(s) in each amplified region. To date, genes known 
to be amplified frequently in breast cancers include myc (8q24), 
ccnd1 (11q13), and erbB2 (17q12-q21) (for review, see Bieche and 
Lidereau, 1995). 

Amplification of the myc, ccnd1, and erbB2 proto-oncogenes 
should have clinical relevance in breast cancer, since independent 
studies have shown that these alterations can be used to identify 
sub-populations with a worse prognosis (Berns et al., 1992; 
Schuuring et al., 1992; Slamon et al., 1987). Muss et al. (1994) 
suggested that these gene alterations may also be useful for the 
prediction and assessment of the efficacy of adjuvant chemotherapy 
and hormone therapy. 

However, published results diverge both in terms of the fre- 
quency of these alterations and their clinical value. For instance, 
over 500 studies in 10 years have failed to resolve the controversy 



surrounding the link suggested by Slamon et al. (1987) between 
erbB2 amplification and disease progression. These discrepancies 
are partly due to the clinical, histological and ethnic heterogeneity 
of breast cancer, but technical considerations are also probably 
involved. 

Specific genes (DNA) were initially quantified in tumor cells by 
means of blotting procedures such as Southern and slot blotting. 
These batch techniques require large amounts of DNA (5-10 
µg/reaction) to yield reliable quantitative results. Furthermore, 
meticulous care is required at all stages of the procedures to 
generate blots of sufficient quality for reliable dosage analysis. 
Recently, PCR has proven to be a powerful tool for quantitative 
DNA analysis, especially with minimal starting quantities of tumor 
samples (small, early-stage tumors and formalin-fixed, paraffin- 
embedded tissues). 

Quantitative PCR can be performed by evaluating the amount of 
product either after a given number of cycles (end-point quantita- 
tive PCR) or after a varying number of cycles during the 
exponential phase (kinetic quantitative PCR). In the first case, an 
internal standard distinct from the target molecule is required to 
ascertain PCR efficiency. The method is relatively easy but implies 
generating, quantifying and storing an internal standard for each 
gene studied. Nevertheless, it is the most frequently applied 
method to date. 

One of the major advantages of the kinetic method is its rapidity 
in quantifying a new gene, since no internal standard is required (an 
external standard curve is sufficient). Moreover, the kinetic method 
has a wide dynamic range (at least 5 orders of magnitude), giving 
an accurate value for samples differing in their copy number. 
Unfortunately, the method is cumbersome and has therefore been 
rarely used. It involves aliquot sampling of each assay mix at 
regular intervals and quantifying, for each aliquot, the amplifica- 
tion product. Interest in the kinetic method has been stimulated by a 
novel approach using fluorescent TaqMan methodology and a new 
instrument (ABI Prism 7700 Sequence Detection System) capable 
of measuring fluorescence in real time (Gibson et al., 1996; Heid et
al., 1996). The TaqMan reaction is based on the 5' nuclease assay
first described by Holland et al. (1991). The latter uses the 5'
nuclease activity of Taq polymerase to cleave a specific fluorogenic
oligonucleotide probe during the extension phase of PCR. The
approach uses dual-labeled fluorogenic hybridization probes (Lee
et al., 1993). One fluorescent dye, covalently linked to the 5' end
of the oligonucleotide, serves as a reporter [FAM (i.e.,
6-carboxy-fluorescein)] and its emission spectrum is quenched by a second
fluorescent dye, TAMRA (i.e., 6-carboxy-tetramethyl-rhodamine)
attached to the 3' end. During the extension phase of the PCR
cycle, the fluorescent hybridization probe is hydrolyzed by the
5'-3' nucleolytic activity of DNA polymerase. Nuclease degradation
of the probe releases the quenching of FAM fluorescence
emission, resulting in an increase in peak fluorescence emission.
The fluorescence signal is normalized by dividing the emission
intensity of the reporter dye (FAM) by the emission intensity of a
reference dye (i.e., ROX, 6-carboxy-X-rhodamine) included in
TaqMan buffer, to obtain a ratio defined as the Rn (normalized 
reporter) for a given reaction tube. The use of a sequence detector 
enables the fluorescence spectra of all 96 wells of the thermal 
cycler to be measured continuously during PCR amplification. 

The real-time PCR method offers several advantages over other
current quantitative PCR methods (Celi et al., 1994): (i) the
probe-based homogeneous assay provides a real-time method for
detecting only specific amplification products, since specific
hybridization of both the primers and the probe is necessary to generate a
signal; (ii) the Ct (threshold cycle) value used for quantification is
measured when PCR amplification is still in the log phase of PCR
product accumulation. This is the main reason why Ct is a more
reliable measure of the starting copy number than are end-point
measurements, in which a slight difference in a limiting component
can have a drastic effect on the amount of product; (iii) use of Ct
values gives a wider dynamic range (at least 5 orders of
magnitude), reducing the need for serial dilution; (iv) the real-time PCR
method is run in a closed-tube system and requires no post-PCR
sample handling, thus avoiding potential contamination; (v) the
system is highly automated, since the instrument continuously
measures fluorescence in all 96 wells of the thermal cycler during
PCR amplification and the corresponding software processes and
analyzes the fluorescence data; (vi) the assay is rapid, as results are
available just one minute after thermal cycling is complete; (vii) the
sample throughput of the method is high, since 96 reactions can be
analyzed in 2 hr.

Here, we applied this semi-automated procedure to determine 
the copy numbers of the 3 most frequently amplified genes in breast 
tumors (myc, ccnd1 and erbB2), as well as 2 genes (alb and app)
located in chromosome regions in which no genetic changes have
been observed in breast tumors. The results for 108 breast tumors
were compared with previous Southern-blot data for the same 
samples. 



MATERIAL AND METHODS 
Tumor and blood samples 

Samples were obtained from 108 primary breast tumors removed
surgically from patients at the Centre René Huguenin; none of the
patients had undergone radiotherapy or chemotherapy. Immediately
after surgery, the tumor samples were placed in liquid
nitrogen until extraction of high-molecular-weight DNA. Patients
were included in this study if the tumor sample used for DNA
preparation contained more than 60% of tumor cells (histological
analysis). A blood sample was also taken from 18 of the same
patients.

DNA was extracted from tumor tissue and blood leukocytes 
according to standard methods. 

Real-time PCR 

Theoretical basis. Reactions are characterized by the point
during cycling when amplification of the PCR product is first
detected, rather than by the amount of PCR product accumulated
after a fixed number of cycles. The higher the starting copy number
of the genomic DNA target, the earlier a significant increase in
fluorescence is observed. The parameter Ct (threshold cycle) is
defined as the fractional cycle number at which the fluorescence
generated by cleavage of the probe passes a fixed threshold above
baseline. The target gene copy number in unknown samples is
quantified by measuring Ct and by using a standard curve to
determine the starting copy number. The precise amount of
genomic DNA (based on optical density) and its quality (i.e., lack
of extensive degradation) are both difficult to assess. We therefore
also quantified a control gene (alb) mapping to chromosome region
4q11-q13, in which no genetic alterations have been found in
breast-tumor DNA by means of CGH (Kallioniemi et al., 1994).

Thus, the ratio of the copy number of the target gene to the copy
number of the alb gene normalizes the amount and quality of
genomic DNA. The ratio defining the level of amplification is
termed "N", and is determined as follows:

$$N = \frac{\text{copy number of target gene (app, myc, ccnd1, erbB2)}}{\text{copy number of reference gene (alb)}}$$

Primers, probes, reference human genomic DNA and PCR 
consumables. Primers and probes were chosen with the assistance 
of the computer programs Oligo 4.0 (National Biosciences, Ply- 
mouth, MN), EuGene (Daniben Systems, Cincinnati, OH) and Primer 
Express (Perkin-Elmer Applied Biosystems, Foster City, CA).

Primers were purchased from DNAgency (Malvern, PA) and 
probes from Perkin-Elmer Applied Biosystems. 

Nucleotide sequences for the oligonucleotide hybridization 
probes and primers are available on request. 

The TaqMan PCR Core reagent kit, MicroAmp optical tubes, 
and MicroAmp caps were from Perkin-Elmer Applied Biosystems.

Standard-curve construction. The kinetic method requires a
standard curve. The latter was constructed with serial dilutions of
specific PCR products, according to Piatak et al. (1993). In
practice, each specific PCR product was obtained by amplifying 20
ng of a standard human genomic DNA (Boehringer, Mannheim,
Germany) with the same primer pairs as those used later for
real-time quantitative PCR. The 5 PCR products were purified
using MicroSpin S-400 HR columns (Pharmacia, Uppsala, Sweden),
electrophoresed through an acrylamide gel and stained with
ethidium bromide to check their quality. The PCR products were
then quantified spectrophotometrically and pooled, and serially
diluted 10-fold in mouse genomic DNA (Clontech, Palo Alto, CA)
at a constant concentration of 2 ng/µl. The standard curve used for
real-time quantitative PCR was based on serial dilutions of the pool
of PCR products ranging from 10^-7 (10^5 copies of each gene) to
10^-10 (10^2 copies). This series of diluted PCR products was
aliquoted and stored at -80°C until use.

The standard curve was validated by analyzing 2 known 
quantities of calibrator human genomic DNA (20 ng and 50 ng).
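
The standard-curve step described above can be sketched as a straight-line fit of Ct against log10 of the starting copy number, which is then inverted to read off the copy number of an unknown. This is only an illustration: the Ct readings below are invented for the example, and the function names are hypothetical, not taken from the paper or from the instrument software.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# Standards span 10^5 to 10^2 copies, as in the text.
standard_copies = [1e5, 1e4, 1e3, 1e2]
# Illustrative Ct values only (assumed, not measured data).
standard_ct = [21.0, 24.4, 27.8, 31.2]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
unknown_copies = copies_from_ct(26.1, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, copies={unknown_copies:.0f}")
```

A slope near -3.3 cycles per decade corresponds to a doubling of product at each cycle; the illustrative values here give a slightly steeper line.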

PCR amplification. Amplification mixes (50 µl) contained the
sample DNA (around 20 ng, around 6600 copies of disomic genes),
10X TaqMan buffer (5 µl), 200 µM dATP, dCTP, dGTP, and 400
µM dUTP, 5 mM MgCl2, 1.25 units of AmpliTaq Gold, 0.5 units of
AmpErase uracil N-glycosylase (UNG), 200 nM each primer and
100 nM probe. The thermal cycling conditions comprised an initial
2 min at 50°C and 10 min at 95°C, followed by 40 cycles at
95°C for 15 s and 65°C for 1 min. Each assay included: a standard
curve (from 10^5 to 10^2 copies) in duplicate, a no-template control,
20 ng and 50 ng of calibrator human genomic DNA (Boehringer) in
triplicate, and about 20 ng of unknown genomic DNA in triplicate
(26 samples can thus be analyzed on a 96-well microplate). All
samples with a coefficient of variation (CV) higher than 10% were
retested.
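
The retest rule for triplicates can be sketched as follows. The text does not say whether the CV was computed on copy numbers or on Ct values, nor which standard deviation was used; this sketch assumes copy numbers and the sample standard deviation, and the function names are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(replicates):
    """CV in percent, using the sample standard deviation (assumption)."""
    return 100.0 * stdev(replicates) / mean(replicates)


def needs_retest(replicates, cv_limit=10.0):
    """Flag a triplicate whose CV exceeds the 10% limit stated in the text."""
    return coefficient_of_variation(replicates) > cv_limit


tight = [4525, 4605, 4678]     # triplicate copy numbers with low scatter
scattered = [4000, 5000, 6000]  # invented triplicate with a CV of 20%

print(needs_retest(tight))
print(needs_retest(scattered))
```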

All reactions were performed in the ABI Prism 7700 Sequence 
Detection System (Perkin-Elmer Applied Biosystems), which 
detects the signal from the fluorogenic probe during PCR. 

Equipment for real-time detection. The 7700 system has a
built-in thermal cycler and a laser directed via fiber optical cables
to each of the 96 sample wells. A charge-coupled-device (CCD)
camera collects the emission from each sample and the data are
analyzed automatically. The software accompanying the 7700
system calculates Ct and determines the starting copy number in the
samples.
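
The instrument software is not described in detail, so the following is only a sketch of how a threshold cycle could be derived from raw readings. It assumes the normalization given in the text (reporter divided by reference dye), a baseline taken as the mean of the first 15 cycles, and linear interpolation between the two cycles that bracket the threshold. The fluorescence trace is synthetic and the function names are hypothetical.

```python
def normalized_reporter(fam, rox):
    """Rn: reporter (FAM) emission divided by reference (ROX) emission."""
    return [f / r for f, r in zip(fam, rox)]


def delta_rn(rn, baseline_cycles=15):
    """Rn minus the mean baseline signal of the early cycles."""
    baseline = sum(rn[:baseline_cycles]) / baseline_cycles
    return [value - baseline for value in rn]


def threshold_cycle(drn, threshold):
    """Fractional cycle (1-indexed) at which drn first crosses the threshold."""
    for i in range(1, len(drn)):
        lo, hi = drn[i - 1], drn[i]
        if lo < threshold <= hi:
            # drn[i - 1] is cycle i, drn[i] is cycle i + 1
            return i + (threshold - lo) / (hi - lo)
    return None  # threshold never reached, e.g. a no-template control


# Synthetic trace: flat for 20 cycles, then doubling increments.
fam = [1.0] * 20 + [1.0 + 0.01 * 2 ** k for k in range(1, 11)]
rox = [1.0] * 30

ct = threshold_cycle(delta_rn(normalized_reporter(fam, rox)), threshold=0.12)
print(ct)
```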
Determination of gene amplification. Gene amplification was
calculated as described above. Only samples with an N value
higher than 2 were considered to be amplified.

RESULTS 

To validate the method, real-time PCR was performed on
genomic DNA extracted from 108 primary breast tumors, and 18
normal leukocyte DNA samples from some of the same patients.
The target genes were the myc, ccnd1 and erbB2 proto-oncogenes,
and the β-amyloid precursor protein gene (app), which maps to a
chromosome region (21q21.2) in which no genetic alterations have
been found in breast tumors (Kallioniemi et al., 1994). The
reference disomic gene was the albumin gene (alb, chromosome
4q11-q13).



Validation of the standard curve and dynamic range 
of real-time PCR 

The standard curve was constructed from PCR products serially
diluted in genomic mouse DNA at a constant concentration of
2 ng/µl. It should be noted that the 5 primer pairs chosen to analyze
the 5 target genes do not amplify genomic mouse DNA (data not
shown). Figure 1 shows the real-time PCR standard curve for the
alb gene. The dynamic range was wide (at least 4 orders of
magnitude), with samples containing as few as 10^2 copies or as
many as 10^5 copies.

Copy-number ratio of the 2 reference genes (app and alb)

The app to alb copy-number ratio was determined in 18 normal
leukocyte DNA samples and all 108 primary breast-tumor DNA
samples. We selected these 2 genes because they are located in 2
chromosome regions (app, 21q21.2; alb, 4q11-q13) in which no
obvious genetic changes (including gains or losses) have been
observed in breast cancers (Kallioniemi et al., 1994). The ratio for
the 18 normal leukocyte DNA samples fell between 0.7 and 1.3
(mean 1.02 ± 0.21), and was similar for the 108 primary
breast-tumor DNA samples (0.6 to 1.6, mean 1.06 ± 0.25), confirming
that alb and app are appropriate reference disomic genes for
breast-tumor DNA. The low range of the ratios also confirmed that
the nucleotide sequences chosen for the primers and probes were
not polymorphic, as mismatches of their primers or probes with the
subject's DNA would have resulted in differential amplification.

Figure 1 - Albumin (alb) gene dosage by real-time PCR. Top: Amplification plots for reactions with starting alb gene copy number ranging
from 10^5 (A9), 10^4 (A7), 10^3 (A4) to 10^2 (A2) and a no-template control (A1). Cycle number is plotted vs. change in normalized reporter signal
(ΔRn). For each reaction tube, the fluorescence signal of the reporter dye (FAM) is divided by the fluorescence signal of the passive reference dye
(ROX), to obtain a ratio defined as the normalized reporter signal (Rn). ΔRn represents the normalized reporter signal (Rn) minus the baseline
signal established in the first 15 PCR cycles. ΔRn increases during PCR as alb PCR product copy number increases until the reaction reaches a
plateau. Ct (threshold cycle) represents the fractional cycle number at which a significant increase in Rn above a baseline signal (horizontal black
line) can first be detected. Two replicate plots were performed for each standard sample, but the data for only one are shown here. Bottom:
Standard curve plotting log starting copy number vs. Ct (threshold cycle). The black dots represent the data for standard samples plotted in
duplicate and the red dots the data for unknown genomic DNA samples plotted in triplicate. The standard curve shows 4 orders of linear dynamic
range.

myc, ccnd1 and erbB2 gene dose in normal leukocyte DNA

To determine the cut-off point for gene amplification in
breast-cancer tissue, 18 normal leukocyte DNA samples were tested for
the gene dose (N), calculated as described in "Material and
Methods". The N value of these samples ranged from 0.5 to 1.3
(mean 0.84 ± 0.22) for myc, 0.7 to 1.6 (mean 1.06 ± 0.23) for
ccnd1 and 0.6 to 1.3 (mean 0.91 ± 0.19) for erbB2. Since N values
for myc, ccnd1 and erbB2 in normal leukocyte DNA consistently
fell between 0.5 and 1.6, values of 2 or more were considered to
represent gene amplification in tumor DNA.

myc, ccnd1 and erbB2 gene dose in breast-tumor DNA

myc, ccnd1 and erbB2 gene copy numbers in the 108 primary
breast tumors are reported in Table I. Extra copies of ccnd1 were
more frequent (23%, 25/108) than extra copies of erbB2 (15%,
16/108) and myc (10%, 11/108), and ranged from 2 to 18.6 for
ccnd1, 2 to 15.1 for erbB2, and only 2 to 4.6 for the myc gene.
Figure 2 and Table II represent tumors in which the ccnd1 gene was
amplified 16-fold (T145), 6-fold (T133) and non-amplified (T118).
The 3 genes were never found to be co-amplified in the same tumor.
erbB2 and ccnd1 were co-amplified in only 3 cases, myc and ccnd1
in 2 cases and myc and erbB2 in 1 case. This favors the hypothesis
that gene amplifications are independent events in breast cancer.
Interestingly, 5 tumors showed a decrease of at least 50% in the
erbB2 copy number (N < 0.5), suggesting that they bore deletions
of the 17q21 region (the site of erbB2). No such decrease in copy
number was observed with the other 2 proto-oncogenes.
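
As a worked check, the N values quoted for T118, T133 and T145 can be reproduced from the triplicate copy numbers listed in Table II by dividing the mean ccnd1 copy number by the mean alb copy number. The cut-offs applied below (N of 2 or more for amplification, N below 0.5 for a decreased copy number) are the ones stated in the text; the function and label names are hypothetical.

```python
from statistics import mean

# Triplicate copy numbers as listed in Table II.
TABLE_II = {
    "T118": {"ccnd1": [4525, 4605, 4678], "alb": [4223, 4365, 4387]},
    "T133": {"ccnd1": [59821, 61659, 61821], "alb": [9787, 10092, 10533]},
    "T145": {"ccnd1": [128563, 125892, 121722], "alb": [7321, 7762, 7933]},
}


def gene_dose(target_copies, reference_copies):
    """N: mean target copy number over mean reference (alb) copy number."""
    return mean(target_copies) / mean(reference_copies)


def classify(n):
    """Apply the cut-offs stated in the text."""
    if n >= 2.0:
        return "amplified"
    if n < 0.5:
        return "decreased"
    return "not amplified"


for tumor, data in TABLE_II.items():
    n = gene_dose(data["ccnd1"], data["alb"])
    print(f"{tumor}: N = {n:.2f} ({classify(n)})")
# T118: N = 1.06 (not amplified)
# T133: N = 6.03 (amplified)
# T145: N = 16.34 (amplified)
```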

Comparison of gene dose determined by real-time quantitative 
PCR and Southern-blot analysis 

Southern-blot analysis of myc, ccnd1 and erbB2 amplifications
had previously been done on the same 108 primary breast tumors. A
perfect correlation between the results of real-time PCR and
Southern blot was obtained for tumors with high copy numbers
(N ≥ 5). However, there were cases (1 myc, 6 ccnd1 and 4 erbB2)
in which real-time PCR showed gene amplification whereas
Southern blot did not, but these were mainly cases with low extra
copy numbers (N from 2 to 2.9).

DISCUSSION 

The clinical applications of gene amplification assays are 
currently limited, but would certainly increase if a simple, standard- 
ized and rapid method were perfected. Gene amplification status 
has been studied mainly by means of Southern blotting, but this 
method is not sensitive enough to detect low-level gene amplifica- 
tion nor accurate enough to quantify the full range of amplification 
values. Southern blotting is also time-consuming, uses radioactive 



TABLE I - DISTRIBUTION OF AMPLIFICATION LEVEL (N) FOR myc,
ccnd1 AND erbB2 GENES IN 108 HUMAN BREAST TUMORS

Gene      Amplification level (N)
          <0.5        0.5-1.9       2-4.9         ≥5
myc       0           97 (89.8%)    11 (10.2%)    0
ccnd1     0           83 (76.9%)    17 (15.7%)    8 (7.4%)
erbB2     5 (4.6%)    87 (80.6%)    8 (7.4%)      8 (7.4%)


reagents and requires relatively large amounts of high-quality 
genomic DNA, which means it cannot be used routinely in many 
laboratories. An amplification step is therefore required to deter- 
mine the copy number of a given target gene from minimal 
quantities of tumor DNA (small early-stage tumors, cytopuncture
specimens or formalin-fixed, paraffin-embedded tissues). 

In this study, we validated a PCR method developed for the
quantification of gene over-representation in tumors. The method,
based on real-time analysis of PCR amplification, has several
advantages over other PCR-based quantitative assays such as
competitive quantitative PCR (Celi et al., 1994). First, the real-time
PCR method is performed in a closed-tube system, avoiding the
risk of contamination by amplified products. Re-amplification of
carryover PCR products in subsequent experiments can also be
prevented by using the enzyme uracil N-glycosylase (UNG)
(Longo et al., 1990). The second advantage is the simplicity and
rapidity of sample analysis, since no post-PCR manipulations are
required. Our results show that the automated method is reliable.
We found it possible to determine, in triplicate, the number of
copies of a target gene in more than 100 tumors per day. Third, the
system has a linear dynamic range of at least 4 orders of magnitude,
meaning that samples do not have to contain equal starting amounts
of DNA. This technique should therefore be suitable for analyzing
formalin-fixed, paraffin-embedded tissues. Fourth, and above all,
real-time PCR makes DNA quantification much more precise and
reproducible, since it is based on Ct values rather than end-point
measurement of the amount of accumulated PCR product. Indeed,
the ABI Prism 7700 Sequence Detection System enables Ct to be
calculated when PCR amplification is still in the exponential phase
and when none of the reaction components is rate-limiting. The
within-run CV of the Ct value for calibrator human DNA (5
replicates) was always below 5%, and the between-assay precision
in 5 different runs was always below 10% (data not shown). In
addition, the use of a standard curve is not absolutely necessary,
since the copy number can be determined simply by comparing the
Ct ratio of the target gene with that of reference genes. The results
obtained by the 2 methods (with and without a standard curve) are
similar in our experiments (data not shown). Moreover, unlike
competitive quantitative PCR, real-time PCR does not require an
internal control (the design and storage of internal controls and the
validation of their amplification efficiency is laborious).
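
The text does not spell out how copy number is derived without a standard curve. One widely used formulation, offered here purely as an assumption, is the comparative Ct calculation: the target-minus-reference Ct difference in the tumor sample is compared with the same difference in a normal calibrator DNA, and the result is converted to a fold change on the premise that product roughly doubles each cycle. The function and parameter names are hypothetical and the Ct values are invented.

```python
def relative_copy_number(ct_target_sample, ct_ref_sample,
                         ct_target_calibrator, ct_ref_calibrator,
                         efficiency=2.0):
    """Comparative Ct estimate of gene dose relative to a calibrator.

    efficiency=2.0 assumes perfect doubling per cycle (an assumption).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return efficiency ** -(delta_sample - delta_calibrator)


# Invented example: the target crosses threshold 2 cycles earlier than
# the reference gene in the tumor, but at the same cycle in normal DNA.
n_estimate = relative_copy_number(24.0, 26.0, 26.0, 26.0)
print(n_estimate)
```

If the amplification efficiencies of the target and reference assays differ appreciably, this shortcut is biased, which is one reason a standard curve may still be preferred.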

The only potential disadvantage of real-time PCR, like all other
PCR-based methods and solid-matrix blotting techniques (Southern
blots and dot blots), is that it cannot avoid dilution artifacts
inherent in the extraction of DNA from tumor cells contained in
heterogeneous tissue specimens. Only FISH and immunohistochemistry
can measure alterations on a cell-by-cell basis (Pauletti et al.,
1996; Slamon et al., 1989). However, FISH requires expensive
equipment and trained personnel and is also time-consuming. 
Moreover, FISH does not assess gene expression and therefore 
cannot detect cases in which the gene product is over-expressed in 
the absence of gene amplification, which will be possible in the 
future by real-time quantitative RT-PCR. Immunohistochemistry is 
subject to considerable variations in the hands of different teams, 
owing to alterations of target proteins during the procedure, the 
different primary antibodies and fixation methods used and the 
criteria used to define positive staining. 

The results of this study are in agreement with those reported in
the literature. (i) Chromosome regions 4q11-q13 and 21q21.2
(which bear alb and app, respectively) showed no genetic alterations
in the breast-cancer samples studied here, in keeping with
the results of CGH (Kallioniemi et al., 1994). (ii) We found that
amplifications of these 3 oncogenes were independent events, as
reported by other teams (Berns et al., 1992; Borg et al., 1992). (iii)
The frequency and degree of myc amplification in our breast tumor
DNA series were lower than those of ccnd1 and erbB2 amplification,
confirming the findings of Borg et al. (1992) and Courjal et al.
(1997). (iv) The maxima of ccnd1 and erbB2 over-representation
were 18-fold and 15-fold, also in keeping with earlier results (about
30-fold maximum) (Berns et al., 1992; Borg et al., 1992; Courjal et
al., 1997). (v) The erbB2 copy numbers obtained with real-time
PCR were in good agreement with data obtained with other
quantitative PCR-based assays in terms of the frequency and
degree of amplification (An et al., 1995; Deng et al., 1996; Valeron
et al., 1996). Our results also correlate well with those recently
published by Gelmini et al. (1997), who used the TaqMan system to
measure erbB2 amplification in a small series of breast tumors
(n = 25), but with an instrument (LS-50B luminescence spectrometer,
Perkin-Elmer Applied Biosystems) which only allows end-point
measurement of fluorescence intensity. Here we report myc
and ccnd1 gene dosage in breast cancer by means of quantitative
PCR. (vi) We found a high degree of concordance between
real-time quantitative PCR and Southern blot analysis in terms of
gene amplification, especially for samples with high copy numbers
(>5-fold). The slightly higher frequency of gene amplification
(especially ccnd1 and erbB2) observed by means of real-time
quantitative PCR as compared with Southern-blot analysis may be
explained by the higher sensitivity of the former method. However,
we cannot rule out the possibility that some tumors with a few extra
gene copies observed in real-time PCR had additional copies of an
arm or a whole chromosome (trisomy, tetrasomy or polysomy)
rather than true gene amplification. These 2 types of genetic
alteration (polysomy and gene amplification) could be easily
distinguished in the future by using an additional probe located on
the same chromosome arm, but some distance from the target gene.
It is noteworthy that high gene copy numbers have the greatest
prognostic significance in breast carcinoma (Borg et al., 1992;
Slamon et al., 1987).

Figure 2 - ccnd1 and alb gene dosage by real-time PCR in 3 breast tumor samples: T118 (E12, C6, black squares), T133 (G11, B4, red squares)
and T145 (A8, C8, blue squares). Given the Ct of each sample, the initial copy number is inferred from the standard curve obtained during the same
experiment. Triplicate plots were performed for each tumor sample, but the data for only one are shown here. The results are shown in Table II.

TABLE II - EXAMPLES OF ccnd1 GENE DOSAGE RESULTS
FROM 3 BREAST TUMORS¹

Tumor    ccnd1                            alb                              Nccnd1/alb
         Copy number    Mean      SD      Copy number    Mean     SD
T118     4525                             4223
         4605           4603      77      4365           4325     89       1.06
         4678                             4387
T133     59821                            9787
         61659          61100     1111    10092          10137    375      6.03
         61821                            10533
T145     128563                           7321
         125892         125392    3448    7762           7672     316      16.34
         121722                           7933

¹For each sample, 3 replicate experiments were performed and the mean
and the standard deviation (SD) were determined. The level of ccnd1 gene
amplification (Nccnd1/alb) is determined by dividing the average ccnd1
copy number value by the average alb copy number value.

Finally, this technique can be applied to the detection of gene
deletion as well as gene amplification. Indeed, we found a
decreased copy number of erbB2 (but not of the other 2
proto-oncogenes) in several tumors; erbB2 is located in a chromosome
region (17q21) reported to contain both deletions and amplifications
in breast cancer (Bieche and Lidereau, 1995).

In conclusion, gene amplification in various cancers can be used 
as a marker of pre-neoplasia, also for early diagnosis of cancer, 
staging, prognostication and choice of treatment. Southern blotting 
is not sufficiently sensitive, and FISH is lengthy and complex. 
Real-time quantitative PCR overcomes both these limitations, and 
is a sensitive and accurate method of analyzing large numbers of 
samples in a short time. It should find a place in routine clinical
gene dosage.

ACKNOWLEDGEMENTS 

RL is a research director at the Institut National de la Santé et de
la Recherche Médicale (INSERM). We thank the staff of the Centre
René Huguenin for assistance in specimen collection and patient
care. 



An, H.X., Niederacher, D., Beckmann, M.W., Göhring, U.J., Scharl, A.,
Picard, F., Van Roeyen, C., Schnürch, H.G. and Bender, H.G., erbB2
gene amplification detected by fluorescent differential polymerase chain
reaction in paraffin-embedded breast carcinoma tissues. Int. J. Cancer
(Pred. Oncol.), 64, 291-297 (1995).

Berns, E.M.J.J., Klijn, J.G.M., Van Putten, W.L.J., Van Staveren, I.L.,
Portengen, H. and Foekens, J.A., c-myc amplification is a better prognostic
factor than HER2/neu amplification in primary breast cancer. Cancer
Res., 52, 1107-1113 (1992).

Bieche, I. and Lidereau, R., Genetic alterations in breast cancer. Genes
Chrom. Cancer, 14, 227-251 (1995).

Borg, A., Baldetorp, B., Ferno, M., Olsson, H. and Sigurdsson, H.,
c-myc amplification is an independent prognostic factor in post-menopausal
breast cancer. Int. J. Cancer, 51, 687-691 (1992).

Celi, F.S., Cohen, M.M., Antonarakis, S.E., Wertheimer, E., Roth, J.
and Shuldiner, A.R., Determination of gene dosage by a quantitative
adaptation of the polymerase chain reaction (gd-PCR): rapid detection of
deletions and duplications of gene sequences. Genomics, 21, 304-310
(1994).

Courjal, F., Cuny, M., Simony-Lafontaine, J., Louasson, G., Speiser, P.,
Zeillinger, R., Rodriguez, C. and Theillet, C., Mapping of DNA
amplifications at 15 chromosomal localizations in 1875 breast tumors:
definition of phenotypic groups. Cancer Res., 57, 4360-4367 (1997).

Deng, G., Yu, M., Chen, L.C., Moore, D., Kurisu, W., Kallioniemi, A.,
Waldman, F.M., Collins, C. and Smith, H.S., Amplifications of oncogene
erbB-2 and chromosome 20q in breast cancer determined by differentially
competitive polymerase chain reaction. Breast Cancer Res. Treat., 40,
271-281 (1996).

Gelmini, S., Orlando, C., Sestini, R., Vona, G., Pinzani, P., Ruocco, L.
and Pazzagli, M., Quantitative polymerase chain reaction-based homogeneous
assay with fluorogenic probes to measure c-erbB-2 oncogene
amplification. Clin. Chem., 43, 752-758 (1997).

Gibson, U.E.M., Heid, C.A. and Williams, P.M., A novel method for
real-time quantitative RT-PCR. Genome Res., 6, 995-1001 (1996).

Heid, C.A., Stevens, J., Livak, K.J. and Williams, P.M., Real-time
quantitative PCR. Genome Res., 6, 986-994 (1996).

Holland, P.M., Abramson, R.D., Watson, R. and Gelfand, D.H.,
Detection of specific polymerase chain reaction product by utilizing the 5'
to 3' exonuclease activity of Thermus aquaticus DNA polymerase. Proc.
nat. Acad. Sci. (Wash.), 88, 7276-7280 (1991).

Kallioniemi, A., Kallioniemi, O.P., Piper, J., Tanner, M., Stokke, T.,
Chen, L., Smith, H.S., Pinkel, D., Gray, J.W. and Waldman, F.M.,
Detection and mapping of amplified DNA sequences in breast cancer by
comparative genomic hybridization. Proc. nat. Acad. Sci. (Wash.), 91,
2156-2160 (1994).

Lee, L.G., Connell, C.R. and Bloch, W., Allelic discrimination by
nick-translation PCR with fluorogenic probes. Nucleic Acids Res., 21,
3761-3766 (1993).

Longo, N., Berninger, N.S. and Hartley, J.L., Use of uracil DNA
glycosylase to control carry-over contamination in polymerase chain
reactions. Gene, 93, 125-128 (1990).

Muss, H.B., Thor, A.D., Berry, D.A., Kute, T., Liu, E.T., Koerner, F.,
Cirrincione, C.T., Budman, D.R., Wood, W.C., Barcos, M. and Henderson,
I.C., c-erbB-2 expression and response to adjuvant therapy in women
with node-positive early breast cancer. New Engl. J. Med., 330, 1260-1266
(1994).

Pauletti, G., Godolphin, W., Press, M.F. and Slamon, D.J., Detection and
quantification of HER-2/neu gene amplification in human breast cancer
archival material using fluorescence in situ hybridization. Oncogene, 13,
63-72 (1996).

Piatak, M., Luk, K.C., Williams, B. and Lifson, J.D., Quantitative
competitive polymerase chain reaction for accurate quantitation of HIV
DNA and RNA species. Biotechniques, 14, 70-80 (1993).

Schuuring, E., Verhoeven, E., Van Tinteren, H., Peterse, J.L., Nunnink,
B., Thunnissen, F.B.J.M., Devilee, P., Cornelisse, C.J., Van de Vijver,
M.J., Mooi, W.J. and Michalides, R.J.A.M., Amplification of genes within
the chromosome 11q13 region is indicative of poor prognosis in patients
with operable breast cancer. Cancer Res., 52, 5229-5234 (1992).

Slamon, D.J., Clark, G.M., Wong, S.G., Levin, W.J., Ullrich, A. and
McGuire, W.L., Human breast cancer: correlation of relapse and survival
with amplification of the HER-2/neu oncogene. Science, 235, 177-182
(1987).

Slamon, D.J., Godolphin, W., Jones, L.A., Holt, J.A., Wong, S.G., Keith,
D.E., Levin, W.J., Stuart, S.G., Udove, J., Ullrich, A. and Press, M.F.,
Studies of the HER-2/neu proto-oncogene in human breast and ovarian
cancer. Science, 244, 707-712 (1989).

Valeron, P.F., Chirino, R., Fernandez, L., Torres, S., Navarro, D.,
Aguiar, J., Cabrera, J.J., Diaz-Chico, B.N. and Diaz-Chico, J.C.,
Validation of a differential PCR and an ELISA procedure in studying
HER-2/neu status in breast cancer. Int. J. Cancer, 65, 129-133 (1996).



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Applicant       : Ashkenazi et al.
App. No.        : 09/903,925
Filed           : July 11, 2001
For             : SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC
                  ACIDS ENCODING THE SAME
Examiner        : Hamud, Fozia M
Group Art Unit  : 1647

CERTIFICATE OF EXPRESS MAILING

I hereby certify that this correspondence is
being deposited with the United States
Postal Service with sufficient postage as
first class mail in an envelope addressed to
Commissioner of Patents, Washington
D.C. 20231 on:

(Date)

Commissioner of Patents
P.O. Box 1450
Alexandria, VA 22313-1450

DECLARATION OF AVI ASHKENAZI, Ph.D. UNDER 37 C.F.R. § 1.132
I, Avi Ashkenazi, Ph.D. declare and say as follows:

1. I am Director and Staff Scientist at the Molecular Oncology Department of
Genentech, Inc., South San Francisco, CA 94080.

2. I joined Genentech in 1988 as a postdoctoral fellow. Since then, I have
investigated a variety of cellular signal transduction mechanisms, including apoptosis, and have
developed technologies to modulate such mechanisms as a means of therapeutic intervention in
cancer and autoimmune disease. I am currently involved in the investigation of a series of
secreted proteins over-expressed in tumors, with the aim to identify useful targets for the
development of therapeutic antibodies for cancer treatment.

3. My scientific Curriculum Vitae, including my list of publications, is attached to 
and forms part of this Declaration (Exhibit A). 

4. Gene amplification is a process in which chromosomes undergo changes to
contain multiple copies of certain genes that normally exist as a single copy, and is an important
factor in the pathophysiology of cancer. Amplification of certain genes (e.g., Myc or Her2/Neu)
gives cancer cells a growth or survival advantage relative to normal cells, and might also provide
a mechanism of tumor cell resistance to chemotherapy or radiotherapy.

5. If gene amplification results in over-expression of the mRNA and the
corresponding gene product, then it identifies that gene product as a promising target for cancer
therapy, for example by the therapeutic antibody approach. Even in the absence of
over-expression of the gene product, amplification of a cancer marker gene - as detected, for example,
by the reverse transcriptase TaqMan® PCR or the fluorescence in situ hybridization (FISH)
assays - is useful in the diagnosis or classification of cancer, or in predicting or monitoring the
efficacy of cancer therapy. An increase in gene copy number can result not only from
intrachromosomal changes but also from chromosomal aneuploidy. It is important to understand
that detection of gene amplification can be used for cancer diagnosis even if the determination
includes measurement of chromosomal aneuploidy. Indeed, as long as a significant difference
relative to normal tissue is detected, it is irrelevant if the signal originates from an increase in the
number of gene copies per chromosome and/or an abnormal number of chromosomes.

6. I understand that according to the Patent Office, absent data demonstrating that 
the increased copy number of a gene in certain types of cancer leads to increased expression of 
its product, gene amplification data are insufficient to provide substantial utility or well 
established utility for the gene product (the encoded polypeptide), or an antibody specifically 
binding the encoded polypeptide. However, even when amplification of a cancer marker gene 
does not result in significant over-expression of the corresponding gene product, this very 
absence of gene product over-expression still provides significant information for cancer 
diagnosis and treatment. Thus, if over-expression of the gene product does not parallel gene
amplification in certain tumor types but does so in others, then parallel monitoring of gene
amplification and gene product over-expression enables more accurate tumor classification and
hence better determination of suitable therapy. In addition, absence of over-expression is crucial
information for the practicing clinician. If a gene is amplified but the corresponding gene
product is not over-expressed, the clinician accordingly will decide not to treat a patient with
agents that target that gene product.

7. I hereby declare that all statements made herein of my own knowledge are true
and that all statements made on information or belief are believed to be true, and further that
these statements were made with the knowledge that willful false statements and the like so
made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the
United States Code and that such willful statements may jeopardize the validity of the
application or any patent issued thereon.

By: [signature]     Date: [illegible]

Avi Ashkenazi, Ph.D.






CURRICULUM VITAE 
Avi Ashkenazi 
July 2003 



Personal:
Date of birth: 29 November, 1956
Address: 1456 Tarrytown Street, San Mateo, CA 94402
Phone: (650) 578-9199 (home); (650) 225-1853 (office)
Fax: (650) 225-6443 (office)
Email: aa@gene.com

Education:

1983: B.S. in Biochemistry, with honors, Hebrew University, Israel
1986: Ph.D. in Biochemistry, Hebrew University, Israel

Employment:

1983-1986: Teaching assistant, undergraduate level course in Biochemistry
1985-1986: Teaching assistant, graduate level course on Signal Transduction
1986-1988: Postdoctoral fellow, Hormone Research Dept., UCSF, and
Developmental Biology Dept., Genentech, Inc., with J. Ramachandran
1988-1989: Postdoctoral fellow, Molecular Biology Dept., Genentech, Inc.,
with D. Capon
1989-1993: Scientist, Molecular Biology Dept., Genentech, Inc.
1994-1996: Senior Scientist, Molecular Oncology Dept., Genentech, Inc.
1996-1997: Senior Scientist and Interim Director, Molecular Oncology Dept.,
Genentech, Inc.
1997-1999: Senior Scientist and preclinical project team leader, Genentech, Inc.
1999-2002: Staff Scientist in Molecular Oncology, Genentech, Inc.
2002-present: Staff Scientist and Director in Molecular Oncology, Genentech, Inc.

Awards:

1988: First prize, The Boehringer Ingelheim Award






Editorial: 
Editorial Board Member, Current Biology.
Associate Editor, Clinical Cancer Research. 
Associate Editor, Cancer Biology and Therapy. 

Refereed papers: 

1. Gertler, A., Ashkenazi, A., and Madar, Z. Binding sites for human growth
hormone and ovine and bovine prolactins in the mammary gland and liver of the
lactating cow. Mol Cell Endocrinol 34, 51-57 (1984).

2. Gertler, A., Shamay, A., Cohen, N., Ashkenazi, A., Friesen, H., Levanon, A.,
Gorecki, M., Aviv, H., Hadari, D., and Vogel, T. Inhibition of lactogenic
activities of ovine prolactin and human growth hormone (hGH) by a novel form of
a modified recombinant hGH. Endocrinology 118, 720-726 (1986).

3. Ashkenazi, A., Madar, Z., and Gertler, A. Partial purification and characterization 
of bovine mammary gland prolactin receptor. Mol Cell Endocrinol 50, 79-87 
(1987). 

4. Ashkenazi, A., Pines, M., and Gertler, A. Down-regulation of lactogenic
hormone receptors in Nb2 lymphoma cells by cholera toxin. Biochemistry
Internatl 14, 1065-1072 (1987).

5. Ashkenazi, A., Cohen, R., and Gertler, A. Characterization of lactogen receptors
in lactogenic hormone-dependent and independent Nb2 lymphoma cell lines.
FEBS Lett. 210, 51-55 (1987).

6. Ashkenazi, A., Vogel, T., Barash, I., Hadari, D., Levanon, A., Gorecki, M., and
Gertler, A. Comparative study on in vitro and in vivo modulation of lactogenic 
and somatotropic receptors by native human growth hormone and its modified 
recombinant analog. Endocrinology 121, 414-419 (1987). 

7. Peralta, E., Winslow, J., Peterson, G., Smith, D., Ashkenazi, A., Ramachandran, 
J., Schimerlik, M., and Capon, D. Primary structure and biochemical properties 
of an M2 muscarinic receptor. Science 236, 600-605 (1987). 

8. Peralta, E., Ashkenazi, A., Winslow, J., Smith, D., Ramachandran, J., and Capon,
D. J. Distinct primary structures, ligand-binding properties and tissue-specific
expression of four human muscarinic acetylcholine receptors. EMBO J. 6,
3923-3929 (1987).

9. Ashkenazi, A., Winslow, J., Peralta, E., Peterson, G., Schimerlik, M., Capon, D.,
and Ramachandran, J. An M2 muscarinic receptor subtype coupled to both 
adenylyl cyclase and phosphoinositide turnover. Science 238, 672-675 (1987). 



10. Pines, M., Ashkenazi, A., Cohen-Chapnik, N., Binder, L., and Gertler, A.
Inhibition of the proliferation of Nb2 lymphoma cells by femtomolar
concentrations of cholera toxin and partial reversal of the effect by
12-o-tetradecanoyl-phorbol-13-acetate. J. Cell. Biochem. 37, 119-129 (1988).

11. Peralta, E., Ashkenazi, A., Winslow, J., Ramachandran, J., and Capon, D.
Differential regulation of PI hydrolysis and adenylyl cyclase by muscarinic
receptor subtypes. Nature 334, 434-437 (1988).

12. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D.
Functionally distinct G proteins couple different receptors to PI hydrolysis in the
same cell. Cell 56, 487-493 (1989).

13. Ashkenazi, A., Ramachandran, J., and Capon, D. Acetylcholine analogue
stimulates DNA synthesis in brain-derived cells via specific muscarinic
acetylcholine receptor subtypes. Nature 340, 146-150 (1989).

14. Lammare, D., Ashkenazi, A., Fleury, S., Smith, D., Sekaly, R., and Capon, D.
The MHC-binding and gp120-binding domains of CD4 are distinct and separable.
Science 245, 743-745 (1989).

15. Ashkenazi, A., Presta, L., Marsters, S., Camerato, T., Rosenthal, K., Fendly, B.,
and Capon, D. Mapping the CD4 binding site for human immunodeficiency
virus type 1 by alanine-scanning mutagenesis. Proc. Natl. Acad. Sci. USA 87,
7150-7154 (1990).

16. Chamow, S., Peers, D., Byrn, R., Mulkerrin, M., Harris, R., Wang, W., Bjorkman,
P., Capon, D., and Ashkenazi, A. Enzymatic cleavage of a CD4 immunoadhesin
generates crystallizable, biologically active Fd-like fragments. Biochemistry 29,
9885-9891 (1990).

17. Ashkenazi, A., Smith, D., Marsters, S., Riddle, L., Gregory, T., Ho, D., and
Capon, D. Resistance of primary isolates of human immunodeficiency virus type
1 to soluble CD4 is independent of CD4-gp120 binding affinity. Proc. Natl.
Acad. Sci. USA 88, 7056-7060 (1991).

18. Ashkenazi, A., Marsters, S., Capon, D., Chamow, S., Figari, I., Pennica, D.,
Goeddel, D., Palladino, M., and Smith, D. Protection against endotoxic shock by
a tumor necrosis factor receptor immunoadhesin. Proc. Natl. Acad. Sci. USA 88,
10535-10539 (1991).

19. Moore, J., McKeating, J., Huang, Y., Ashkenazi, A., and Ho, D. Virions of
primary HIV-1 isolates resistant to sCD4 neutralization differ in sCD4 affinity and
glycoprotein gp120 retention from sCD4-sensitive isolates. J. Virol. 66, 235-243
(1992).






20. Jin, H., Oksenberg, D., Ashkenazi, A., Peroutka, S., Duncan, A., Rozmahel, R.,
Yang, Y., Mengod, G., Palacios, J., and O'Dowd, B. Characterization of the
human 5-hydroxytryptamine1B receptor. J. Biol. Chem. 267, 5735-5738 (1992).

21. Marsters, A., Frutkin, A., Simpson, N., Fendly, B., and Ashkenazi, A.
Identification of cysteine-rich domains of the type 1 tumor necrosis receptor
involved in ligand binding. J. Biol. Chem. 267, 5747-5750 (1992).

22. Chamow, S., Kogan, T., Peers, D., Hastings, R., Byrn, R., and Ashkenazi, A.
Conjugation of sCD4 without loss of biological activity via a novel
carbohydrate-directed cross-linking reagent. J. Biol. Chem. 267, 15916-15922 (1992).

23. Oksenberg, D., Marsters, A., O'Dowd, B., Jin, H., Havlik, S., Peroutka, S., and
Ashkenazi, A. A single amino-acid difference confers major pharmacologic
variation between human and rodent 5-HT1B receptors. Nature 360, 161-163
(1992).

24. Haak-Frendscho, M., Marsters, S., Chamow, S., Peers, D., Simpson, N., and
Ashkenazi, A. Inhibition of interferon γ by an interferon γ receptor
immunoadhesin. Immunology 79, 594-599 (1993).

25. Pennica, D., Lam, V., Weber, R., Kohr, W., Basa, L., Spellman, M., Ashkenazi, A.,
Shire, S., and Goeddel, D. Biochemical characterization of the extracellular
domain of the 75-kd tumor necrosis factor receptor. Biochemistry 32, 3131-3138
(1993).

26. Barfod, L., Zheng, Y., Kuang, W., Hart, M., Evans, T., Cerione, R., and
Ashkenazi, A. Cloning and expression of a human CDC42 GTPase Activating
Protein reveals a functional SH3-binding domain. J. Biol. Chem. 268,
26059-26062 (1993).

27. Chamow, S., Zhang, D., Tan, X., Mhtre, S., Marsters, S., Peers, D., Byrn, R.,
Ashkenazi, A., and Yunghans, R. A humanized bispecific immunoadhesin-
antibody that retargets CD3+ effectors to kill HIV-1-infected cells. J. Immunol.
153, 4268-4280 (1994).

28. Means, R., Krantz, S., Luna, J., Marsters, S., and Ashkenazi, A. Inhibition of
murine erythroid colony formation in vitro by interferon γ and correction by
interferon γ receptor immunoadhesin. Blood 83, 911-915 (1994).

29. Haak-Frendscho, M., Marsters, S., Mordenti, J., Gillet, N., Chen, S.,
and Ashkenazi, A. Inhibition of TNF by a TNF receptor immunoadhesin:
comparison with an anti-TNF mAb. J. Immunol. 152, 1347-1353 (1994).






30. Chamow, S., Kogan, T., Venuti, M., Gadek, T., Peers, D., Mordenti, J., Shak, S.,
and Ashkenazi, A. Modification of CD4 immunoadhesin with monomethoxy-
PEG aldehyde via reductive alkylation. Bioconj. Chem. 5, 133-140 (1994).

31. Jin, H., Yang, R., Marsters, S., Bunting, S., Wurm, F., Chamow, S., and
Ashkenazi, A. Protection against rat endotoxic shock by p55 tumor necrosis factor
(TNF) receptor immunoadhesin: comparison to anti-TNF monoclonal antibody.
J. Infect. Diseases 170, 1323-1326 (1994).

32. Beck, J., Marsters, S., Harris, R., Ashkenazi, A., and Chamow, S. Generation of
soluble interleukin-1 receptor from an immunoadhesin by specific cleavage. Mol.
Immunol. 31, 1335-1344 (1994).

33. Pitti, B., Marsters, M., Haak-Frendscho, M., Osaka, G., Mordenti, J., Chamow, S.,
and Ashkenazi, A. Molecular and biological properties of an interleukin-1
receptor immunoadhesin. Mol. Immunol. 31, 1345-1351 (1994).

34. Oksenberg, D., Havlik, S., Peroutka, S., and Ashkenazi, A. The third intracellular
loop of the 5-HT2 receptor specifies effector coupling. J. Neurochem. 64,
1440-1447 (1995).

35. Bach, E., Szabo, S., Dighe, A., Ashkenazi, A., Aguet, M., Murphy, K., and
Schreiber, R. Ligand-induced autoregulation of IFN-γ receptor β chain expression
in T helper cell subsets. Science 270, 1215-1218 (1995).

36. Jin, H., Yang, R., Marsters, S., Ashkenazi, A., Bunting, S., Marra, M., Scott, R.,
and Baker, J. Protection against endotoxic shock by bactericidal/permeability-
increasing protein in rats. J. Clin. Invest. 95, 1947-1952 (1995).

37. Marsters, S., Pennica, D., Bach, E., Schreiber, R., and Ashkenazi, A. Interferon γ
signals via a high-affinity multisubunit receptor complex that contains two types
of polypeptide chain. Proc. Natl. Acad. Sci. USA 92, 5401-5405 (1995).

38. Van Zee, K., Moldawer, L., Oldenburg, H., Thompson, W., Stackpole, S.,
Montegut, W., Rogy, M., Meschter, C., Gallati, H., Schiller, C., Richter, W.,
Loetcher, H., Ashkenazi, A., Chamow, S., Wurm, F., Calvano, S., Lowry, S., and
Lesslauer, W. Protection against lethal E. coli bacteremia in baboons by
pretreatment with a 55-kDa TNF receptor-Ig fusion protein, Ro45-2081. J.
Immunol. 156, 2221-2230 (1996).

39. Pitti, R., Marsters, S., Ruppert, S., Donahue, C., Moore, A., and Ashkenazi, A.
Induction of apoptosis by Apo-2 Ligand, a new member of the tumor necrosis
factor cytokine family. J. Biol. Chem. 271, 12687-12690 (1996).






40. Marsters, S., Pitti, R., Donahue, C., Rupert, S., Bauer, K., and Ashkenazi, A.
Activation of apoptosis by Apo-2 ligand is independent of FADD but blocked by
CrmA. Curr. Biol. 6, 1669-1676 (1996).

41. Marsters, S., Skubatch, M., Gray, C., and Ashkenazi, A. Herpesvirus entry
mediator, a novel member of the tumor necrosis factor receptor family, activates
the NF-κB and AP-1 transcription factors. J. Biol. Chem. 272, 14029-14032
(1997).

42. Sheridan, J., Marsters, S., Pitti, R., Gurney, A., Skubatch, M., Baldwin, D.,
Ramakrishnan, L., Gray, C., Baker, K., Wood, W.I., Goddard, A., Godowski, P., and
Ashkenazi, A. Control of TRAIL-induced apoptosis by a family of signaling and
decoy receptors. Science 277, 818-821 (1997).

43. Marsters, S., Sheridan, J., Pitti, R., Gurney, A., Skubatch, M., Baldwin, D., Huang, A.,
Yuan, J., Goddard, A., Godowski, P., and Ashkenazi, A. A novel receptor for
Apo2L/TRAIL contains a truncated death domain. Curr. Biol. 7, 1003-1006 (1997).

44. Marsters, A., Sheridan, J., Pitti, R., Brush, J., Goddard, A., and Ashkenazi, A.
Identification of a ligand for the death-domain-containing receptor Apo3. Curr. Biol.
8, 525-528 (1998).

45. Rieger, J., Naumann, U., Glaser, T., Ashkenazi, A., and Weller, M. Apo2 ligand:
a novel weapon against malignant glioma? FEBS Lett. 427, 124-128 (1998).

46. Pender, S., Fell, J., Chamow, S., Ashkenazi, A., and MacDonald, T. A p55 TNF
receptor immunoadhesin prevents T cell mediated intestinal injury by inhibiting
matrix metalloproteinase production. J. Immunol. 160, 4098-4103 (1998).

47. Pitti, R., Marsters, S., Lawrence, D., Roy, M., Kischkel, F., Dowd, P., Huang, A.,
Donahue, C., Sherwood, S., Baldwin, D., Godowski, P., Wood, W., Gurney, A.,
Hillan, K., Cohen, R., Goddard, A., Botstein, D., and Ashkenazi, A. Genomic
amplification of a decoy receptor for Fas ligand in lung and colon cancer. Nature
396, 699-703 (1998).

48. Mori, S., Marakami-Mori, K., Nakamura, S., Ashkenazi, A., and Bonavida, B.
Sensitization of AIDS Kaposi's sarcoma cells to Apo-2 ligand-induced apoptosis
by actinomycin D. J. Immunol. 162, 5616-5623 (1999).

49. Gurney, A., Marsters, S., Huang, A., Pitti, R., Mark, M., Baldwin, D., Gray, A.,
Dowd, P., Brush, J., Heldens, S., Schow, P., Goddard, A., Wood, W., Baker, K.,
Godowski, P., and Ashkenazi, A. Identification of a new member of the tumor
necrosis factor family and its receptor, a human ortholog of mouse GITR. Curr.
Biol. 9, 215-218 (1999).






50. Ashkenazi, A., Pai, R., Fong, S., Leung, S., Lawrence, D., Marsters, S., Blackie,
C., Chang, L., McMurtrey, A., Hebert, A., DeForge, L., Khoumenis, L., Lewis, D.,
Harris, L., Bussiere, J., Koeppen, H., Shahrokh, Z., and Schwall, R. Safety and
anti-tumor activity of recombinant soluble Apo2 ligand. J. Clin. Invest. 104,
155-162 (1999).

51. Chuntharapai, A., Gibbs, V., Lu, L., Ow, A., Marsters, S., Ashkenazi, A., De Vos,
A., Kim, K.J. Determination of residues involved in ligand binding and signal
transmission in the human IFN-α receptor 2. J. Immunol. 163, 766-773 (1999).

52. Johnsen, A.-C., Haux, J., Steinkjer, B., Nonstad, U., Egeberg, K., Sundan, A.,
Ashkenazi, A., and Espevik, T. Regulation of Apo2L/TRAIL expression in NK
cells - involvement in NK cell-mediated cytotoxicity. Cytokine 11, 664-672
(1999).

53. Roth, W., Isenmann, S., Naumann, U., Kugler, S., Bahr, M., Dichgans, J.,
Ashkenazi, A., and Weller, M. Eradication of intracranial human malignant
glioma xenografts by Apo2L/TRAIL. Biochem. Biophys. Res. Commun. 265,
479-483 (1999).

54. Hymowitz, S.G., Christinger, H.W., Fuh, G., Ultsch, M., O'Connell, M., Kelley,
R.F., Ashkenazi, A., and de Vos, A.M. Triggering Cell Death: The Crystal
Structure of Apo2L/TRAIL in a Complex with Death Receptor 5. Molec. Cell 4,
563-571 (1999).

55. Hymowitz, S.G., O'Connell, M.P., Ultsch, M.H., Hurst, A., Totpal, K., Ashkenazi,
A., de Vos, A.M., Kelley, R.F. A unique zinc-binding site revealed by a
high-resolution X-ray structure of homotrimeric Apo2L/TRAIL. Biochemistry 39,
633-640 (2000).

56. Zhou, Q., Fukushima, P., DeGraff, W., Mitchell, J.B., Stetler-Stevenson, M.,
Ashkenazi, A., and Steeg, P.S. Radiation and the Apo2L/TRAIL apoptotic
pathway preferentially inhibit the colonization of premalignant human breast
cancer cells overexpressing cyclin D1. Cancer Res. 60, 2611-2615 (2000).

57. Kischkel, F.C., Lawrence, D.A., Chuntharapai, A., Schow, P., Kim, J., and
Ashkenazi, A. Apo2L/TRAIL-dependent recruitment of endogenous FADD and
Caspase-8 to death receptors 4 and 5. Immunity 12, 611-620 (2000).

58. Yan, M., Marsters, S.A., Grewal, I.S., Wang, H., *Ashkenazi, A., and *Dixit,
V.M. Identification of a receptor for BLyS demonstrates a crucial role in humoral
immunity. Nature Immunol. 1, 37-41 (2000).



59. Marsters, S.A., Yan, M., Pitti, R.M., Haas, P.E., Dixit, V.M., and Ashkenazi, A.
Interaction of the TNF homologues BLyS and APRIL with the TNF receptor
homologues BCMA and TACI. Curr. Biol. 10, 785-788 (2000).

60. Kischkel, F.C., and Ashkenazi, A. Combining enhanced metabolic labeling with
immunoblotting to detect interactions of endogenous cellular proteins.
Biotechniques 29, 506-512 (2000).

61. Lawrence, D., Shahrokh, Z., Marsters, S., Achilles, K., Shih, D., Mounho, B.,
Hillan, K., Totpal, K., DeForge, L., Schow, P., Hooley, J., Sherwood, S., Pai, R.,
Leung, S., Khan, L., Gliniak, B., Bussiere, J., Smith, C., Strong, S., Kelley, S.,
Fox, J., Thomas, D., and Ashkenazi, A. Differential hepatocyte toxicity of
recombinant Apo2L/TRAIL versions. Nature Med. 7, 383-385 (2001).

62. Chuntharapai, A., Dodge, K., Grimmer, K., Schroeder, K., Marsters, S.A.,
Koeppen, H., Ashkenazi, A., and Kim, K.J. Isotype-dependent inhibition of
tumor growth in vivo by monoclonal antibodies to death receptor 4. J. Immunol.
166, 4891-4898 (2001).

63. Pollack, I.F., Erff, M., and Ashkenazi, A. Direct stimulation of apoptotic
signaling by soluble Apo2L/tumor necrosis factor-related apoptosis-inducing
ligand leads to selective killing of glioma cells. Clin. Cancer Res. 7, 1362-1369
(2001).

64. Wang, H., Marsters, S.A., Baker, T., Chan, B., Lee, W.P., Fu, L., Tumas, D., Yan,
M., Dixit, V.M., *Ashkenazi, A., and *Grewal, I.S. TACI-ligand interactions are
required for T cell activation and collagen-induced arthritis in mice. Nature
Immunol. 2, 632-637 (2001).

65. Kischkel, F.C., Lawrence, D.A., Tinel, A., Virmani, A., Schow, P., Gazdar, A.,
Blenis, J., Amort, D., and Ashkenazi, A. Death receptor recruitment of
endogenous caspase-10 and apoptosis initiation in the absence of caspase-8. J.
Biol. Chem. 276, 46639-46646 (2001).

66. LeBlanc, H., Lawrence, D.A., Varfolomeev, E., Totpal, K., Morlan, J., Schow, P.,
Fong, S., Schwall, R., Sinicropi, D., and Ashkenazi, A. Tumor cell resistance to
death receptor induced apoptosis through mutational inactivation of the
proapoptotic Bcl-2 homolog Bax. Nature Med. 8, 274-281 (2002).

67. Miller, K., Meng, G., Liu, J., Hurst, A., Hsei, V., Wong, W-L., Ekert, R.,
Lawrence, D., Sherwood, S., DeForge, L., Gaudreault., Keller, G., Sliwkowski,
M., Ashkenazi, A., and Presta, L. Design, Construction, and analyses of
multivalent antibodies. J. Immunol. 170, 4854-4861 (2003).






68. Varfolomeev, E., Kischkel, F., Martin, F., Wang, H., Lawrence, D., Olsson, C.,
Tom, L., Erickson, S., French, D., Schow, P., Grewal, I., and Ashkenazi, A.
Immune system development in APRIL knockout mice. Submitted.

Review articles: 

1. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D. J.
Functional role of muscarinic acetylcholine receptor subtype diversity. Cold
Spring Harbor Symposium on Quantitative Biology. LIII, 263-272 (1988).

2. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D.
Functional diversity of muscarinic receptor subtypes in cellular signal
transduction and growth. Trends Pharmacol. Sci. Dec Supplement, 12-21 (1989).

3. Chamow, S., Duliege, A., Ammann, A., Kahn, J., Allen, D., Eichberg, J., Byrn,
R., Capon, D., Ward, R., and Ashkenazi, A. CD4 immunoadhesins in anti-HIV
therapy: new developments. Int. J. Cancer Supplement 7, 69-72 (1992).

4. Ashkenazi, A., Capon, D., and Ward, R. Immunoadhesins. Int. Rev. Immunol. 10,
217-225 (1993).

5. Ashkenazi, A., and Peralta, E. Muscarinic Receptors. In Handbook of Receptors
and Channels. (S. Peroutka, ed.), CRC Press, Boca Raton, Vol. I, p. 1-27 (1994).

6. Krantz, S. B., Means, R. T., Jr., Luna, J., Marsters, S. A., and Ashkenazi, A.
Inhibition of erythroid colony formation in vitro by gamma interferon. In
Molecular Biology of Hematopoiesis (N. Abraham, R. Shadduck, A. Levine, F.
Takaku, eds.) Intercept Ltd. Paris, Vol. 3, p. 135-147 (1994).

7. Ashkenazi, A. Cytokine neutralization as a potential therapeutic approach for
SIRS and shock. Biotechnology in Healthcare 1, 197-206 (1994).

8. Ashkenazi, A., and Chamow, S. M. Immunoadhesins: an alternative to human
monoclonal antibodies. Immunomethods: A companion to Methods in
Enzymology 8, 104-115 (1995).

9. Chamow, S., and Ashkenazi, A. Immunoadhesins: Principles and Applications.
Trends Biotech. 14, 52-60 (1996).

10. Ashkenazi, A., and Chamow, S. M. Immunoadhesins as research tools and
therapeutic agents. Curr. Opin. Immunol. 9, 195-200 (1997).

11. Ashkenazi, A., and Dixit, V. Death receptors: signaling and modulation. Science
281, 1305-1308 (1998).

12. Ashkenazi, A., and Dixit, V. Apoptosis control by death and decoy receptors.
Curr. Opin. Cell Biol. 11, 255-260 (1999).

13. Ashkenazi, A. Chapters on Apo2L/TRAIL; DR4, DR5, DcR1, DcR2; and DcR3.
Online Cytokine Handbook (www.apnet.com/cytokinereference/).

14. Ashkenazi, A. Targeting death and decoy receptors of the tumor necrosis factor
superfamily. Nature Rev. Cancer 2, 420-430 (2002).

15. LeBlanc, H. and Ashkenazi, A. Apoptosis signaling by Apo2L/TRAIL. Cell Death
and Differentiation 10, 66-75 (2003).

16. Almasan, A. and Ashkenazi, A. Apo2L/TRAIL: apoptosis signaling, biology, and
potential for cancer therapy. Cytokine and Growth Factor Reviews 14, 337-348
(2003).

Book: 

Antibody Fusion Proteins (Chamow, S., and Ashkenazi, A., eds., John Wiley and
Sons Inc.) (1999). 

Talks: 

1. Resistance of primary HIV isolates to CD4 is independent of CD4-gp120 binding
affinity. UCSD Symposium, HIV Disease: Pathogenesis and Therapy.
Greenelefe, FL, March 1991.

2. Use of immuno-hybrids to extend the half-life of receptors. IBC conference on
Biopharmaceutical Halflife Extension. New Orleans, LA, June 1992.

3. Results with TNF receptor Immunoadhesins for the Treatment of Sepsis. IBC
conference on Endotoxemia and Sepsis. Philadelphia, PA, June 1992.

4. Immunoadhesins: an alternative to human antibodies. IBC conference on
Antibody Engineering. San Diego, CA, December 1993.

5. Tumor necrosis factor receptor: a potential therapeutic for human septic shock.
American Society for Microbiology Meeting, Atlanta, GA, May 1993.

6. Protective efficacy of TNF receptor immunoadhesin vs anti-TNF monoclonal
antibody in a rat model for endotoxic shock. 5th International Congress on TNF.
Asilomar, CA, May 1994.

7. Interferon-γ signals via a multisubunit receptor complex that contains two types of
polypeptide chain. American Association of Immunologists Conference. San
Francisco, CA, July 1995.

8. Immunoadhesins: Principles and Applications. Gordon Research Conference on
Drug Delivery in Biology and Medicine. Ventura, CA, February 1996.






9. Apo-2 Ligand, a new member of the TNF family that induces apoptosis in tumor
cells. Cambridge Symposium on TNF and Related Cytokines in Treatment of
Cancer. Hilton-Head, NC, March 1996.

10. Induction of apoptosis by Apo2 Ligand. American Society for Biochemistry and
Molecular Biology, Symposium on Growth Factors and Cytokine Receptors. New
Orleans, LA, June 1996.

11. Apo2 ligand, an extracellular trigger of apoptosis. 2nd Clontech Symposium,
Palo Alto, CA, October 1996.

12. Regulation of apoptosis by members of the TNF ligand and receptor families.
Stanford University School of Medicine, Palo Alto, CA, December 1996.

13. Apo-3: a novel receptor that regulates cell death and inflammation. 4th
International Congress on Immune Consequences of Trauma, Shock, and Sepsis.
Munich, Germany, March 1997.

14. New members of the TNF ligand and receptor families that regulate apoptosis,
inflammation, and immunity. UCLA School of Medicine, LA, CA, March 1997.

15. Immunoadhesins: an alternative to monoclonal antibodies. 5th World Conference
on Bispecific Antibodies. Volendam, Holland, June 1997.

16. Control of Apo2L signaling. Cold Spring Harbor Laboratory Symposium on
Programmed Cell Death. Cold Spring Harbor, New York. September 1997.

17. Chairman and speaker, Apoptosis Signaling session. EC's 4th Annual
Conference on Apoptosis. San Diego, CA, October 1997.

18. Control of Apo2L signaling by death and decoy receptors. American Association
for the Advancement of Science. Philadelphia, PA, February 1998.

19. Apo2 ligand and its receptors. American Society of Immunologists. San
Francisco, CA, April 1998.

20. Death receptors and ligands. 7th International TNF Congress. Cape Cod, MA,
May 1998.

21. Apo2L as a potential therapeutic for cancer. UCLA School of Medicine. LA,
CA, June 1998.

22. Apo2L as a potential therapeutic for cancer. Gordon Research Conference on
Cancer Chemotherapy. New London, NH, July 1998.

23. Control of apoptosis by Apo2L. Endocrine Society Conference, Stevenson, WA,
August 1998.

24. Control of apoptosis by Apo2L. International Cytokine Society Conference,
Jerusalem, Israel, October 1998.






25. Apoptosis control by death and decoy receptors. American Association for
Cancer Research Conference, Whistler, BC, Canada, March 1999.

26. Apoptosis control by death and decoy receptors. American Society for
Biochemistry and Molecular Biology Conference, San Francisco, CA, May 1999.

27. Apoptosis control by death and decoy receptors. Gordon Research Conference on
Apoptosis, New London, NH, June 1999.

28. Apoptosis control by death and decoy receptors. Arthritis Foundation Research
Conference, Alexandria GA, Aug 1999.

29. Safety and anti-tumor activity of recombinant soluble Apo2L/TRAIL. Cold
Spring Harbor Laboratory Symposium on Programmed Cell Death. Cold Spring
Harbor, NY, September 1999.

30. The Apo2L/TRAIL system: therapeutic potential. American Association for
Cancer Research, Lake Tahoe, NV, Feb 2000.

31. Apoptosis and cancer therapy. Stanford University School of Medicine, Stanford,
CA, Mar 2000.

32. Apoptosis and cancer therapy. University of Pennsylvania School of Medicine,
Philadelphia, PA, Apr 2000.

33. Apoptosis signaling by Apo2L/TRAIL. International Congress on TNF.
Trondheim, Norway, May 2000.

34. The Apo2L/TRAIL system: therapeutic potential. Cap-CURE summit meeting.
Santa Monica, CA, June 2000.

35. The Apo2L/TRAIL system: therapeutic potential. MD Anderson Cancer Center.
Houston, TX, June 2000.

36. Apoptosis signaling by Apo2L/TRAIL. The Protein Society, 14th Symposium.
San Diego, CA, August 2000.

37. Anti-tumor activity of Apo2L/TRAIL. AAPS annual meeting. Indianapolis, IN,
Aug 2000.

38. Apoptosis signaling and anti-cancer potential of Apo2L/TRAIL. Cancer Research
Institute, UC San Francisco, CA, September 2000.

39. Apoptosis signaling by Apo2L/TRAIL. Keynote address, TNF family
Minisymposium, NIH. Bethesda, MD, September 2000.

40. Death receptors: signaling and modulation. Keystone symposium on the
Molecular basis of cancer. Taos, NM, Jan 2001.

41. Preclinical studies of Apo2L/TRAIL in cancer. Symposium on Targeted therapies
in the treatment of lung cancer. Aspen, CO, Jan 2001.






42. Apoptosis signaling by Apo2L/TRAIL. Weizmann Institute of Science, Rehovot,
Israel, March 2001.

43. Apo2L/TRAIL: Apoptosis signaling and potential for cancer therapy. Weizmann
Institute of Science, Rehovot, Israel, March 2001.

44. Targeting death receptors in cancer with Apo2L/TRAIL. Cell Death and Disease
conference, North Falmouth, MA, Jun 2001.

45. Targeting death receptors in cancer with Apo2L/TRAIL. Biotechnology
Organization conference, San Diego, CA, Jun 2001.

46. Apo2L/TRAIL signaling and apoptosis resistance mechanisms. Gordon Research
Conference on Apoptosis, Oxford, UK, July 2001.

47. Apo2L/TRAIL signaling and apoptosis resistance mechanisms. Cleveland Clinic
Foundation, Cleveland, OH, Oct 2001.

48. Apoptosis signaling by death receptors: overview. International Society for
Interferon and Cytokine Research conference, Cleveland, OH, Oct 2001.

49. Apoptosis signaling by death receptors. American Society of Nephrology
Conference. San Francisco, CA, Oct 2001.

50. Targeting death receptors in cancer. Apoptosis: commercial opportunities. San
Diego, CA, Apr 2002.

51. Apo2L/TRAIL signaling and apoptosis resistance mechanisms. Kimmel Cancer
Research Center, Johns Hopkins University, Baltimore, MD. May 2002.

52. Apoptosis control by Apo2L/TRAIL (Keynote Address). University of Alabama
Cancer Center Retreat, Birmingham, AL. October 2002.

53. Apoptosis signaling by Apo2L/TRAIL. (Session co-chair) TNF international
conference. San Diego, CA. October 2002.

54. Apoptosis signaling by Apo2L/TRAIL. Swiss Institute for Cancer Research
(ISREC). Lausanne, Switzerland. Jan 2003.

55. Apoptosis induction with Apo2L/TRAIL. Conference on New Targets and
Innovative Strategies in Cancer Treatment. Monte Carlo. February 2003.

56. Apoptosis signaling by Apo2L/TRAIL. Hermelin Brain Tumor Center
Symposium on Apoptosis. Detroit, MI. April 2003.

57. Targeting apoptosis through death receptors. Sixth Annual Conference on
Targeted Therapies in the Treatment of Breast Cancer. Kona, Hawaii. July 2003.

58. Targeting apoptosis through death receptors. Second International Conference on
Targeted Cancer Therapy. Washington, DC. Aug 2003.

Issued Patents: 






1. Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 5,329,028 (Jul 12, 1994).

2. Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 5,605,791 (Feb 25, 1997).

3. Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 5,889,155 (Jul 27, 1999).

4. Ashkenazi, A. APO-2 Ligand. US patent 6,030,945 (Feb 29, 2000).

5. Ashkenazi, A., Chuntharapai, A., Kim, J. APO-2 ligand antibodies. US patent
6,046,048 (Apr 4, 2000).

6. Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 6,124,435 (Sep 26, 2000).

7. Ashkenazi, A., Chuntharapai, A., Kim, J. Method for making monoclonal and
cross-reactive antibodies. US patent 6,252,050 (Jun 26, 2001).

8. Ashkenazi, A. APO-2 Receptor. US patent 6,342,369 (Jan 29, 2002).

9. Ashkenazi, A., Fong, S., Goddard, A., Gurney, A., Napier, M., Tumas, D., Wood, W.
A-33 polypeptides. US patent 6,410,708 (Jun 25, 2002).

10. Ashkenazi, A. APO-3 Receptor. US patent 6,462,176 B1 (Oct 8, 2002).

11. Ashkenazi, A. APO-2LI and APO-3 polypeptide antibodies. US patent 6,469,144 B1
(Oct 22, 2002).

12. Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 6,582,928 B1 (Jun 24, 2003).






DECLARATION OF PAUL POLAKIS, Ph.D.



I, Paul Polakis, Ph.D., declare and say as follows: 

1. I was awarded a Ph.D. by the Department of Biochemistry of the Michigan 
State University in 1984. My scientific Curriculum Vitae is attached to and forms 
part of this Declaration (Exhibit A). 

2. I am currently employed by Genentech, Inc. where my job title is Staff 
Scientist. Since joining Genentech in 1999, one of my primary responsibilities has 
been leading Genentech's Tumor Antigen Project, which is a large research project
with a primary focus on identifying tumor cell markers that find use as targets for 
both the diagnosis and treatment of cancer in humans. 

3. As part of the Tumor Antigen Project, my laboratory has been analyzing 
differential expression of various genes in tumor cells relative to normal cells. 
The purpose of this research is to identify proteins that are abundantly expressed 
on certain tumor cells and that are either (i) not expressed, or (ii) expressed at
lower levels, on corresponding normal cells. We call such differentially expressed 
proteins "tumor antigen proteins". When such a tumor antigen protein is 
identified, one can produce an antibody that recognizes and binds to that protein. 
Such an antibody finds use in the diagnosis of human cancer and may ultimately 
serve as an effective therapeutic in the treatment of human cancer. 

4. In the course of the research conducted by Genentech's Tumor Antigen 
Project, we have employed a variety of scientific techniques for detecting and 
studying differential gene expression in human tumor cells relative to normal cells, 
at genomic DNA, mRNA and protein levels. An important example of one such 
technique is the well known and widely used technique of microarray analysis 
which has proven to be extremely useful for the identification of mRNA molecules 
that are differentially expressed in one tissue or cell type relative to another. In the 
course of our research using microarray analysis, we have identified 
approximately 200 gene transcripts that are present in human tumor cells at 
significantly higher levels than in corresponding normal human cells. To date, we 
have generated antibodies that bind to about 30 of the tumor antigen proteins 
expressed from these differentially expressed gene transcripts and have used these 
antibodies to quantitatively determine the level of production of these tumor 
antigen proteins in both human cancer cells and corresponding normal cells. We
have then compared the levels of mRNA and protein in both the tumor and normal 
cells analyzed. 

5. From the mRNA and protein expression analyses described in paragraph 4
above, we have observed that there is a strong correlation between changes in the 
level of mRNA present in any particular cell type and the level of protein 



expressed from that mRNA in that cell type. In approximately 80% of our 
observations we have found that increases in the level of a particular mRNA 
correlates with changes in the level of protein expressed from that mRNA when 
human tumor cells are compared with their corresponding normal cells. 

6. Based upon my own experience accumulated in more than 20 years of 
research, including the data discussed in paragraphs 4 and 5 above and my 
knowledge of the relevant scientific literature, it is my considered scientific 
opinion that for human genes, an increased level of mRNA in a tumor cell relative 
to a normal cell typically correlates to a similar increase in abundance of the 
encoded protein in the tumor cell relative to the normal cell. In fact, it remains a 
central dogma in molecular biology that increased mRNA levels are predictive of 
corresponding increased levels of the encoded protein. While there have been 
published reports of genes for which such a correlation does not exist, it is my 
opinion that such reports are exceptions to the commonly understood general rule 
that increased mRNA levels are predictive of corresponding increased levels of the 
encoded protein. 

7. I hereby declare that all statements made herein of my own knowledge are 
true and that all statements made on information or belief are believed to be true, 
and further that these statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or imprisonment, or both, 
under Section 1001 of Title 18 of the United States Code and that such willful
statements may jeopardize the validity of the application or any patent issued 
thereon. 



Dated: [illegible]




Paul Polakis, Ph.D. 






[CANCER RESEARCH 62, 6240-6245, November 1, 2002]

Impact of DNA Amplification on Gene Expression Patterns in Breast Cancer 1,2

Elizabeth Hyman, 3 Paivikki Kauraniemi, 3 Sampsa Hautaniemi, Maija Wolf, Spyro Mousses, Ester Rozenblum,
Markus Ringner, Guido Sauter, Outi Monni, Abdel Elkahloun, Olli-P. Kallioniemi, and Anne Kallioniemi 4

Howard Hughes Medical Institute-NIH Research Scholar, Bethesda, Maryland 20892 [E. H.]; Cancer Genetics Branch, National Human Genome Research Institute, NIH,
Bethesda, Maryland 20892 [E. H., P. K., S. H., M. W., S. M., E. R., M. R., A. E., O. K., A. K.]; Laboratory of Cancer Genetics, Institute of Medical Technology, University of
Tampere and Tampere University Hospital, FIN-33520 Tampere, Finland [P. K., A. K.]; Signal Processing Laboratory, Tampere University of Technology, FIN-33101 Tampere,
Finland [S. H.]; Institute of Pathology, University of Basel, CH-4003 Basel, Switzerland [G. S.]; and Biomedicum Biochip Center, Helsinki University Hospital, Biomedicum
Helsinki, FIN-00014 Helsinki, Finland [O. M.]



ABSTRACT 

Genetic changes underlie tumor progression and may lead to cancer-
specific expression of critical genes. Over 1100 publications have de-
scribed the use of comparative genomic hybridization (CGH) to analyze
the pattern of copy number alterations in cancer, but very few of the genes
affected are known. Here, we performed high-resolution CGH analysis on
cDNA microarrays in breast cancer and directly compared copy number
and mRNA expression levels of 13,824 genes to quantitate the impact of
genomic changes on gene expression. We identified and mapped the
boundaries of 24 independent amplicons, ranging in size from 0.2 to 12
Mb. Throughout the genome, both high- and low-level copy number
changes had a substantial impact on gene expression, with 44% of the
highly amplified genes showing overexpression and 10.5% of the highly
overexpressed genes being amplified. Statistical analysis with random
permutation tests identified 270 genes whose expression levels across 14
samples were systematically attributable to gene amplification. These
included most previously described amplified genes in breast cancer and
many novel targets for genomic alterations, including the HOXB7 gene,
the presence of which in a novel amplicon at 17q21.3 was validated in
10.2% of primary breast cancers and associated with poor patient prog- 
nosis. In conclusion, CGH on cDNA microarrays revealed hundreds of 
novel genes whose overexpression is attributable to gene amplification. 
These genes may provide insights to the clonal evolution and progression 
of breast cancer and highlight promising therapeutic targets. 

INTRODUCTION 

Gene expression patterns revealed by cDNA microarrays have 
facilitated classification of cancers into biologically distinct catego- 
ries, some of which may explain the clinical behavior of the tumors 
(1-6). Despite this progress in diagnostic classification, the molecular 
mechanisms underlying gene expression patterns in cancer have re- 
mained elusive, and the utility of gene expression profiling in the 
identification of specific therapeutic targets remains limited. 

Accumulation of genetic defects is thought to underlie the clonal 
evolution of cancer. Identification of the genes that mediate the effects 
of genetic changes may be important by highlighting transcripts that 
are actively involved in tumor progression. Such transcripts and their 
encoded proteins would be ideal targets for anticancer therapies, as 
demonstrated by the clinical success of new therapies against ampli- 
fied oncogenes, such as ERBB2 and EGFR (7, 8), in breast cancer and 
other solid tumors. Besides amplifications of known oncogenes, over
20 recurrent regions of DNA amplification have been mapped in
breast cancer by CGH 5 (9, 10). However, these amplicons are often
large and poorly defined, and their impact on gene expression remains
unknown.

Fig. 1. Impact of gene copy number on global gene expression levels. A, percentage of
over- and underexpressed genes (Y axis) according to copy number ratios (X axis).
Threshold values used for over- and underexpression were >2.184 (global upper 7% of
the cDNA ratios) and <0.4826 (global lower 7% of the expression ratios). B, percentage
of amplified and deleted genes according to expression ratios. Threshold values for
amplification and deletion were >1.5 and <0.7.

We hypothesized that genome-wide identification of those gene
expression changes that are attributable to underlying gene copy
number alterations would highlight transcripts that are actively in-
volved in the causation or maintenance of the malignant phenotype.
To identify such transcripts, we applied a combination of cDNA and
CGH microarrays to: (a) determine the global impact that gene copy
number variation plays in breast cancer development and progression;
and (b) identify and characterize those genes whose mRNA expression
is most significantly associated with amplification of the corresponding
genomic template.

5 The abbreviations used are: CGH, comparative genomic hybridization; FISH, fluo-
rescence in situ hybridization; RT-PCR, reverse transcription-PCR.

GENE EXPRESSION PATTERNS IN BREAST CANCER

Fig. 2. Genome-wide copy number and expression analysis in the MCF-7 breast cancer cell line. A, chromosomal CGH analysis of MCF-7. The copy number ratio profile (blue
line) across the entire genome from 1p telomere to Xq telomere is shown along with ± 1 SD (orange lines). The black horizontal line indicates a ratio of 1.0; red line, a ratio of 0.8;
and green line, a ratio of 1.2. B-C, genome-wide copy number analysis in MCF-7 by CGH on cDNA microarray. The copy number ratios were plotted as a function of the position
of the cDNA clones along the human genome. In B, individual data points are connected with a line, and a moving median of 10 adjacent clones is shown. Red horizontal line, the
copy number ratio of 1.0. In C, individual data points are labeled by color coding according to cDNA expression ratios. The bright red dots indicate the upper 2%, and dark red dots,
the next 5% of the expression ratios in MCF-7 cells (overexpressed genes); bright green dots indicate the lowest 2%, and dark green dots, the next 5% of the expression ratios
(underexpressed genes); the rest of the observations are shown with black crosses. The chromosome numbers are shown at the bottom of the figure, and chromosome boundaries are
indicated with a dashed line.

MATERIALS AND METHODS 

Breast Cancer Cell Lines. Fourteen breast cancer cell lines (BT-20, BT-
474, HCC1428, Hs578t, MCF7, MDA-361, MDA-436, MDA-453, MDA-468,
SKBR-3, T-47D, UACC812, ZR-75-1, and ZR-75-30) were obtained from the
American Type Culture Collection (Manassas, VA). Cells were grown under 
recommended culture conditions. Genomic DNA and mRNA were isolated 
using standard protocols. 

Copy Number and Expression Analyses by cDNA Microarrays. The
preparation and printing of the 13,824 cDNA clones on glass slides were
performed as described (11-13). Of these clones, 244 represented uncharac-
terized expressed sequence tags, and the remainder corresponded to known
genes. CGH experiments on cDNA microarrays were done as described (14,
15). Briefly, 20 µg of genomic DNA from breast cancer cell lines and normal
human WBCs were digested for 14-18 h with AluI and RsaI (Life Technol-
ogies, Inc., Rockville, MD) and purified by phenol/chloroform extraction. Six
µg of digested cell line DNAs were labeled with Cy3-dUTP (Amersham
Pharmacia) and normal DNA with Cy5-dUTP (Amersham Pharmacia) using
the Bioprime Labeling kit (Life Technologies, Inc.). Hybridization (14, 15) and
posthybridization washes (13) were done as described. For the expression
analyses, a standard reference (Universal Human Reference RNA; Stratagene,
La Jolla, CA) was used in all experiments. Forty µg of reference RNA were
labeled with Cy3-dUTP and 3.5 µg of test mRNA with Cy5-dUTP, and the
labeled cDNAs were hybridized on microarrays as described (13, 15). For both
microarray analyses, a laser confocal scanner (Agilent Technologies, Palo
Alto, CA) was used to measure the fluorescence intensities at the target
locations using the DEARRAY software (16). After background subtraction,
average intensities at each clone in the test hybridization were divided by the
average intensity of the corresponding clone in the control hybridization. For
the copy number analysis, the ratios were normalized on the basis of the
distribution of ratios of all targets on the array and for the expression analysis
on the basis of 88 housekeeping genes, which were spotted four times onto the
array. Low quality measurements (i.e., copy number data with mean reference
intensity <100 fluorescent units, and expression data with both test and
reference intensity <100 fluorescent units and/or with spot size <50 units)
were excluded from the analysis and were treated as missing values. The
distributions of fluorescence ratios were used to define cutpoints for increased/
decreased copy number. Genes with CGH ratio >1.43 (representing the upper
5% of the CGH ratios across all experiments) were considered to be amplified,
and genes with ratio <0.73 (representing the lower 5%) were considered to be
deleted.
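
The ratio, quality-filter, and cutpoint steps described above can be sketched roughly as follows. This is a minimal illustration only, assuming the intensities are already background-subtracted; the function names (`intensity_ratios`, `percentile_cutpoints`, `call_copy_number`) are hypothetical and are not taken from the DEARRAY software or from the article.

```python
import numpy as np

def intensity_ratios(test, ref, min_ref=100.0):
    """Test/reference ratios; spots with reference intensity below min_ref become NaN.

    Assumes both inputs are already background-subtracted (an assumption of this sketch).
    """
    test = np.asarray(test, dtype=float)
    ref = np.asarray(ref, dtype=float)
    ratios = np.full(test.shape, np.nan)
    ok = ref >= min_ref                      # low-quality spots are treated as missing
    ratios[ok] = test[ok] / ref[ok]
    return ratios

def percentile_cutpoints(ratios, tail=5.0):
    """Lower and upper cutpoints taken from the tails of the ratio distribution."""
    r = np.asarray(ratios, dtype=float)
    return np.nanpercentile(r, tail), np.nanpercentile(r, 100.0 - tail)

def call_copy_number(ratios, amp_cut=1.43, del_cut=0.73):
    """Label each ratio using the cutpoints quoted in the text (>1.43, <0.73)."""
    labels = []
    for r in np.asarray(ratios, dtype=float):
        if np.isnan(r):
            labels.append("missing")
        elif r > amp_cut:
            labels.append("amplified")
        elif r < del_cut:
            labels.append("deleted")
        else:
            labels.append("normal")
    return labels
```

For example, `call_copy_number(intensity_ratios([300, 50, 150], [200, 90, 200]))` labels the second spot as missing because its reference intensity falls below 100 units.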

Statistical Analysis of CGH and cDNA Microarray Data. To evaluate
the influence of copy number alterations on gene expression, we applied the
following statistical approach. CGH and cDNA calibrated intensity ratios were
log-transformed and normalized using median centering of the values in each
cell line. Furthermore, cDNA ratios for each gene across all 14 cell lines were
median centered. For each gene, the CGH data were represented by a vector
that was labeled 1 for amplification (ratio, >1.43) and 0 for no amplification.
Amplification was correlated with gene expression using the signal-to-noise
statistics (1). We calculated a weight, $w_g$, for each gene as follows:

$$w_g = \frac{m_{g1} - m_{g0}}{\sigma_{g1} + \sigma_{g0}}$$

where $m_{g1}$, $\sigma_{g1}$ and $m_{g0}$, $\sigma_{g0}$ denote the means and SDs for the expression
levels for amplified and nonamplified cell lines, respectively. To assess the
statistical significance of each weight, we performed 10,000 random permu-
tations of the label vector. The probability that a gene had a larger or equal
weight by random permutation than the original weight was denoted by α. A
low α (<0.05) indicates a strong association between gene expression and
amplification.
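
The weight and permutation test described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it uses the sample SD (`ddof=1`), requires at least two cell lines in each group, and the function names and the fixed random seed are hypothetical choices made for the example.

```python
import numpy as np

def signal_to_noise(expr, amplified):
    """Weight = (mean_amp - mean_nonamp) / (sd_amp + sd_nonamp) for one gene."""
    expr = np.asarray(expr, dtype=float)
    amplified = np.asarray(amplified, dtype=bool)
    amp, non = expr[amplified], expr[~amplified]
    if len(amp) < 2 or len(non) < 2:
        # Assumption of this sketch: an SD needs at least two values per group.
        raise ValueError("need at least two cell lines in each group")
    return (amp.mean() - non.mean()) / (amp.std(ddof=1) + non.std(ddof=1))

def permutation_alpha(expr, amplified, n_perm=10_000, seed=0):
    """Fraction of label permutations whose weight is >= the observed weight."""
    rng = np.random.default_rng(seed)
    amplified = np.asarray(amplified, dtype=bool)
    observed = signal_to_noise(expr, amplified)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(amplified)   # shuffle the 0/1 label vector
        if signal_to_noise(expr, shuffled) >= observed:
            hits += 1
    return observed, hits / n_perm
```

Shuffling the label vector keeps the number of amplified cell lines fixed, so each permuted weight is comparable with the observed one.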

Genomic Localization of cDNA Clones and Amplicon Mapping. Each 
cDNA clone on the microarray was assigned to a Unigene cluster using the 
Unigene Build 141. 6 A database of genomic sequence alignment information
for mRNA sequences was created from the August 2001 freeze of the Uni- 
versity of California Santa Cruz's GoldenPath database. 7 The chromosome and 
bp positions for each cDNA clone were then retrieved by relating these data 
sets. Amplicons were defined as a CGH copy number ratio >2.0 in at least two 
adjacent clones in two or more cell lines or a CGH ratio >2.0 in at least three 
adjacent clones in a single cell line. The amplicon start and end positions were 



6 Internet address: http://re$earchjihgn\nih.gov/iw 

7 Internet address: www.genome.ucsc.edu.

Table 1 Summary of independent amplicons in 14 breast cancer cell lines by
CGH microarray

Location            Start (Mb)   End (Mb)   Size (Mb)
1p13                132.79       132.94     0.2
1q21                173.92       177.25     3.3
1q22                179.28       179.57     0.3
3p14                71.94        74.66      2.7
7p12.1-7p11.2       55.62        60.95      5.3
7q31                125.73       130.96     5.2
7q32                140.01       140.68     0.7
8q21.11-8q21.13     86.45        92.46      6.0
8q21.3              98.45        103.05     4.6
8q23.3-8q24.14      129.88       142.15     12.3
8q24.22             151.21       152.16     1.0
9p13                38.65        39.25      0.6
13q22-q31           77.15        81.38      4.2
16q22               86.70        87.62      0.9
17q11               29.30        30.85      1.6
17q12-q21.2         39.79        42.80      3.0
17q21.32-q21.33     52.47        55.80      3.3
17q22-q23.3         63.81        69.70      5.9
17q23.3-q24.3       69.93        74.99      5.1
19q13               40.63        41.40      0.8
20q11.22            34.59        35.85      1.3
20q13.12            44.00        45.62      1.6
20q13.12-q13.13     46.45        49.43      3.0
20q13.2-q13.32      51.32        59.12      7.8


CGH were validated, with 1q21, 17q12-q21.2, 17q22-q23, 20q13.1,
and 20q13.2 regions being most commonly amplified. Furthermore,
the boundaries of these amplicons were precisely delineated. In ad-
dition, novel amplicons were identified at 9p13 (38.65-39.25 Mb),
and 17q21.3 (52.47-55.80 Mb).

Direct Identification of Putative Amplification Target Genes.
The cDNA/CGH microarray technique enables the direct correla-
tion of copy number and expression data on a gene-by-gene basis
throughout the genome. We directly annotated high-resolution
CGH plots with gene expression data using color coding. Fig. 2C
shows that most of the amplified genes in the MCF-7 breast cancer
cell line at 1p13, 17q22-q23, and 20q13 were highly overex-
pressed. A view of chromosome 7 in the MDA-468 cell line
implicates EGFR as the most highly overexpressed and amplified
gene at 7p11-p12 (Fig. 3A). In BT-474, the two known amplicons
at 17q12 and 17q22-q23 contained numerous highly overex-
pressed genes (Fig. 3B). In addition, several genes, including the
homeobox genes HOXB2 and HOXB7, were highly amplified in a
previously undescribed independent amplicon at 17q21.3. HOXB7
was systematically amplified (as validated by FISH, Fig. 3B, inset)
as well as overexpressed (as verified by RT-PCR, data not shown)
in BT-474, UACC812, and ZR-75-30 cells. Furthermore, this novel



extended to include neighboring nonamplified clones (ratio, <1.5). The am-
plicon size determination was partially dependent on local clone density.
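
The amplicon definition used in this section (a CGH ratio >2.0 in at least two adjacent clones in two or more cell lines, or in at least three adjacent clones in a single cell line) can be sketched as below. This is a simplified illustration with hypothetical function names; it assumes the clones of one chromosome are already sorted by genomic position, and it does not implement the boundary extension to neighboring nonamplified clones.

```python
import numpy as np

def runs_above(values, threshold=2.0):
    """(start, end) index pairs, end exclusive, of consecutive values above threshold."""
    runs, start = [], None
    for i, v in enumerate(values):
        above = (not np.isnan(v)) and v > threshold
        if above and start is None:
            start = i
        elif not above and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(values)))
    return runs

def amplicon_clone_mask(ratios, threshold=2.0):
    """Mark clones that satisfy either amplicon rule.

    ratios: 2-D array, rows = clones in genomic order, columns = cell lines.
    """
    ratios = np.asarray(ratios, dtype=float)
    n_clones, n_lines = ratios.shape
    mask = np.zeros(n_clones, dtype=bool)
    # Rule 1: at least three adjacent clones above threshold in a single cell line.
    for j in range(n_lines):
        for s, e in runs_above(ratios[:, j], threshold):
            if e - s >= 3:
                mask[s:e] = True
    # Rule 2: the same pair of adjacent clones above threshold in two or more cell lines.
    above = ratios > threshold               # NaN compares as False
    for i in range(n_clones - 1):
        if np.sum(above[i] & above[i + 1]) >= 2:
            mask[i:i + 2] = True
    return mask
```

Contiguous stretches of `True` in the returned mask would then correspond to candidate amplicons, whose start and end positions can be read from the clone coordinates.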

FISH. Dual-color interphase FISH to breast cancer cell lines was done as
described (17). Bacterial artificial chromosome clone RP11-361K8 was la-
beled with SpectrumOrange (Vysis, Downers Grove, IL), and SpectrumOrange-
labeled probe for EGFR was obtained from Vysis. SpectrumGreen-
labeled chromosome 7 and 17 centromere probes (Vysis) were used as a
reference. A tissue microarray containing 612 formalin-fixed, paraffin-embed-
ded primary breast cancers (17) was applied in FISH analyses as described
(18). The use of these specimens was approved by the Ethics Committee of the
University of Basel and by the NIH. Specimens containing a 2-fold or higher
increase in the number of test probe signals, as compared with corresponding
centromere signals, in at least 10% of the tumor cells were considered to be
amplified. Survival analysis was performed using the Kaplan-Meier method
and the log-rank test.

RT-PCR. The HOXB7 expression level was determined relative to
GAPDH. Reverse transcription and PCR amplification were performed using
Access RT-PCR System (Promega Corp., Madison, WI) with 10 ng of mRNA
as a template. HOXB7 primers were 5'-GAGCAGAGGGACTCGGACTT-3'
and 5'-GCGTCAGGTAGCGATTGTAG-3'.

RESULTS 

Global Effect of Copy Number on Gene Expression. 13,824
arrayed cDNA clones were applied for analysis of gene expression
and gene copy number (CGH microarrays) in 14 breast cancer cell
lines. The results illustrate a considerable influence of copy number
on gene expression patterns. Up to 44% of the highly amplified
transcripts (CGH ratio, >2.5) were overexpressed (i.e., belonged to
the global upper 7% of expression ratios), compared with only 6% for
genes with normal copy number levels (Fig. 1A). Conversely, 10.5%
of the transcripts with high-level expression (cDNA ratio, >10)
showed increased copy number (Fig. 1B). Low-level copy number
increases and decreases were also associated with similar, although
less dramatic, outcomes on gene expression (Fig. 1).
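
The conditional percentages reported above (for example, the share of highly amplified transcripts that are also overexpressed) can be computed along the following lines. This is a hedged sketch with a hypothetical function name; the default cutoffs simply reuse the values quoted in the text and in the Fig. 1 legend, and paired measurements with a missing value are skipped.

```python
import numpy as np

def percent_overexpressed(cgh, expr, cgh_min=2.5, expr_cut=2.184):
    """Percentage of measurements with CGH ratio > cgh_min whose expression ratio > expr_cut."""
    cgh = np.asarray(cgh, dtype=float)
    expr = np.asarray(expr, dtype=float)
    valid = ~np.isnan(cgh) & ~np.isnan(expr)   # drop pairs with a missing value
    selected = valid & (cgh > cgh_min)
    if not selected.any():
        return float("nan")
    return 100.0 * float(np.mean(expr[selected] > expr_cut))
```

Swapping the roles of the two arrays (select on expression ratio, then count amplified genes) gives the converse percentage described in the same paragraph.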

Identification of Distinct Breast Cancer Amplicons. Base-pair
locations obtained for 11,994 cDNAs (86.8%) were used to plot copy
number changes as a function of genomic position (Fig. 2, Supplement
Fig. A). The average spacing of clones throughout the genome
was 267 kb. This high-resolution mapping identified 24 independent
breast cancer amplicons, spanning from 0.2 to 12 Mb of DNA (Table
1). Several amplification sites detected previously by chromosomal

Fig. 3. Annotation of gene expression data on CGH microarray profiles. A, genes in the
7p11-p12 amplicon in the MDA-468 cell line are highly expressed (red dots) and include
the EGFR oncogene. B, several genes in the 17q12, 17q21.3, and 17q23 amplicons in the
BT-474 breast cancer cell line are highly overexpressed (red) and include the HOXB7
gene. The data labels and color coding are as indicated for Fig. 2C. Insets show
chromosomal CGH profiles for the corresponding chromosomes and validation of the
increased copy number by interphase FISH using EGFR (red) and chromosome 7
centromere probe (green) to MDA-468 (A) and HOXB7-specific probe (red) and chro-
mosome 17 centromere (green) to BT-474 cells (B).

Fig. 4. List of 50 genes with a statistically
significant correlation (α value <0.05) between
gene copy number and gene expression. Name,
chromosomal location, and the α value for each
gene are indicated. The genes have been ordered
according to their position in the genome. The color
maps on the right illustrate the copy number and
expression ratio patterns in the 14 cell lines. The
key to the color code is shown at the bottom of the
graph. Gray squares, missing values. The complete
list of 270 genes is shown in supplemental Fig. B.

amplification was validated to be present in 10.2% of 363 primary
breast cancers by FISH to a tissue microarray and was associated 
with poor prognosis of the patients (P = 0.001). 

Statistical Identification and Characterization of 270 Highly
Expressed Genes in Amplicons. Statistical comparison of expres- 
sion levels of all genes as a function of gene amplification identified 
270 genes whose expression was significantly influenced by copy 
number across all 14 cell lines (Fig. 4, Supplemental Fig. B). Accord- 
ing to the gene ontology data, 8 91 of the 270 genes represented 
hypothetical proteins or genes with no functional annotation, whereas 
179 had associated functional information available. Of these, 151 
(84%) are implicated in apoptosis, cell proliferation, signal transduc- 
tion, and transcription, whereas 28 (16%) had functional annotations 
that could not be directly linked with cancer. 



DISCUSSION 

The importance of recurrent gene and chromosome copy number 
changes in the development and progression of solid tumors has been 
characterized in >1000 publications applying CGH 9 (9, 10), as well 
as in a large number of other molecular cytogenetic, cytogenetic, and 
molecular genetic studies. The effects of these somatic genetic 
changes on gene expression levels have remained largely unknown, 
although a few studies have explored gene expression changes occur-
ring in specific amplicons (15, 19-21). Here, we applied genome-
wide cDNA microarrays to identify transcripts whose expression
changes were attributable to underlying gene copy number alterations 
in breast cancer. 

The overall impact of copy number on gene expression patterns was
substantial with the most dramatic effects seen in the case of high-
level copy number increase. Low-level copy number gains and losses
also had a significant influence on expression levels of genes in the
regions affected, but these effects were more subtle on a gene-by-gene
basis than those of high-level amplifications. However, the impact of
low-level gains on the dysregulation of gene expression patterns in
cancer may be equally important if not more important than that of
high-level amplifications. Aneuploidy and low-level gains and losses
of chromosomal arms represent the most common types of genetic
alterations in breast and other cancers and, therefore, have an influ-
ence on many genes. Our results in breast cancer extend the recent
studies on the impact of aneuploidy on global gene expression pat-
terns in yeast cells, acute myeloid leukemia, and a prostate cancer
model system (22-24).

8 Internet address: http://www.geneontology.org/.

The CGH microarray analysis identified 24 independent breast
cancer amplicons. We defined the precise boundaries for many am- 
plicons detected previously by chromosomal CGH (9, 10, 25, 26) and 
also discovered novel amplicons that had not been detected previ- 
ously, presumably because of their small size (only 1-2 Mb) or close 
proximity to other larger amplicons. One of these novel amplicons 
involved the homeobox gene region at 17q21.3 and led to the over-
expression of the HOXB7 and HOXB2 genes. The homeodomain
transcription factors are known to be key regulators of embryonic 
development and have been occasionally reported to undergo aberrant 
expression in cancer (27, 28). HOXB7 transfection induced cell pro- 
liferation in melanoma, breast, and ovarian cancer cells and increased 
tumorigenicity and angiogenesis in breast cancer (29-32). The pres- 
ent results imply that gene amplification may be a prominent mech- 
anism for overexpressing HOXB7 in breast cancer and suggest that 
HOXB7 contributes to tumor progression and confers an aggressive 
disease phenotype in breast cancer. This view is supported by our 
finding of amplification of HOXB7 in 10% of 363 primary breast 
cancers, as well as an association of amplification with poor prognosis 
of the patients. 

We carried out a systematic search to identify genes whose 
expression levels across all 14 cell lines were attributable to 
amplification status. Statistical analysis revealed 270 such genes 
(representing ~2% of all genes on the array), including not only
previously described amplified genes, such as HER-2, MYC,
EGFR, ribosomal protein s6 kinase, and AIB3, but also numerous
novel genes such as NRAS-related gene (lpl3), syndecan-2 (8q22), 
and bone morphogenic protein (20ql3.1), whose activation by 
amplification may similarly promote breast cancer progression. 
Most of the 270 genes have not been implicated previously in 
breast cancer development and suggest novel pathogenetic mech- 
anisms. Although we would not expect all of them to be causally 
involved, it is intriguing that 84% of the genes with associated 
functional information were implicated in apoptosis, cell prolifer- 
ation, signal transduction, transcription, or other cellular processes 
that could directly imply a possible role in cancer progression. 
Therefore, a detailed characterization of these genes may provide 
biological insights to breast cancer progression and might lead to 
the development of novel therapeutic strategies. 

In summary, we demonstrate application of cDNA microarrays
to the analysis of both copy number and expression levels of over 
12,000 transcripts throughout the breast cancer genome, roughly 
once every 267 kb. This analysis provided: (a) evidence of a 
prominent global influence of copy number changes on gene 
expression levels; (b) a high-resolution map of 24 independent 
amplicons in breast cancer; and (c) identification of a set of 270 
genes, the overexpression of which was statistically attributable to 
gene amplification. Characterization of a novel amplicon at 
17q21.3 implicated amplification and overexpression of the
HOXB7 gene in breast cancer, including a clinical association 



between HOXB7 amplification and poor patient prognosis. Overall,
our results illustrate how the identification of genes activated by 
gene amplification provides a powerful approach to highlight 
genes with an important role in cancer as well as to prioritize and 
validate putative targets for therapy development.

Genomic DNA copy number alterations are key genetic events in
the development and progression of human cancers. Here we 
report a genome-wide microarray comparative genomic hybrid- 
isation (array CGH) analysis of DNA copy number variation in 
a series of primary human breast tumors. We have profiled DNA 
copy number alteration across 6,691 mapped human genes, in 44 
predominantly advanced, primary breast tumors and 10 breast 
cancer cell lines. While the overall patterns of DNA amplification 
and deletion corroborate previous cytogenetic studies, the high- 
resolution (gene-by-gene) mapping of amplicon boundaries and 
the quantitative analysis of amplicon shape provide significant 
improvement in the localization of candidate oncogenes. Parallel 
microarray measurements of mRNA levels reveal the remarkable 
degree to which variation in gene copy number contributes to 
variation in gene expression in tumor cells. Specifically, we find 
that 62% of highly amplified genes show moderately or highly 
elevated expression, that DNA copy number influences gene ex- 
pression across a wide range of DNA copy number alterations 
(deletion, low-, mid- and high-level amplification), that on average, 
a 2-fold change in DNA copy number is associated with a corre-
sponding 1.5-fold change in mRNA levels, and that overall, at least
12% of all the variation in gene expression among the breast 
tumors is directly attributable to underlying variation in gene copy 
number. These findings provide evidence that widespread DNA 
copy number alteration can lead directly to global deregulation of 
gene expression, which may contribute to the development or 
progression of cancer. 
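
As a rough worked reading of the 2-fold and 1.5-fold figures quoted in this abstract, and assuming purely for illustration that the average relationship between copy number and mRNA level follows a power law (an assumption of this note, not a claim made in the abstract), the implied exponent would be:

```latex
% C = DNA copy number, E = mRNA level, \beta = assumed power-law exponent (all hypothetical symbols)
\frac{E_2}{E_1} = \left(\frac{C_2}{C_1}\right)^{\beta},
\qquad 2^{\beta} = 1.5
\;\Longrightarrow\;
\beta = \log_2 1.5 \approx 0.585
```

Under that same assumption, a 4-fold change in copy number would correspond to about a 1.5 × 1.5 = 2.25-fold change in mRNA level.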

Conventional cytogenetic techniques, including comparative 
genomic hybridization (CGH) (1), have led to the identifi- 
cation of a number of recurrent regions of DNA copy number 
alteration in breast cancer cell lines and tumors (2-4). While 
some of these regions contain known or candidate oncogenes 
[e.g., FGFR1 (8p11), MYC (8q24), CCND1 (11q13), ERBB2
(17q12), and ZNF217 (20q13)] and tumor suppressor genes
[RB1 (13q14) and TP53 (17p13)], the relevant gene(s) within
other regions (e.g., gain of 1q, 8q22, and 17q22-24, and loss of
8p) remain to be identified. A high-resolution genome-wide 
map, delineating the boundaries of DNA copy number alter- 
ations in tumors, should facilitate the localization and identifi- 
cation of oncogenes and tumor suppressor genes in breast 
cancer. In this study, we have created such a map, using 
array-based CGH (5-7) to profile DNA copy number alteration 
in a series of breast cancer cell lines and primary tumors. 

An unresolved question is the extent to which the widespread 
DNA copy number changes that we and others have identified 
in breast tumors alter expression of genes within involved 
regions. Because we had measured mRNA levels in parallel in 
the same samples (8), using the same DNA microarrays, we had 
an opportunity to explore on a genomic scale the relationship 
between DNA copy number changes and gene expression. From 
this analysis, we have identified a significant impact of widespread
DNA copy number alteration on the transcriptional
programs of breast tumors. 

Materials and Methods 

Tumors and Cell Lines. Primary breast tumors were predominantly 
large (>3 cm), intermediate-grade, infiltrating ductal carcino- 
mas, with more than 50% being lymph node positive. The 
fraction of tumor cells within specimens averaged at least 50%. 
Details of individual tumors have been published (8, 9), and 
are summarized in Table 1, which is published as supporting 
information on the PNAS web site, www.pnas.org. Breast cancer 
cell lines were obtained from the American Type Culture 
Collection. Genomic DNA was isolated either using Qiagen 
genomic DNA columns, or by phenol/chloroform extraction 
followed by ethanol precipitation. 

DNA Labeling and Microarray Hybridizations. Genomic DNA label- 
ing and hybridizations were performed essentially as described 
in Pollack et al. (7), with slight modifications. Two micrograms 
of DNA was labeled in a total volume of 50 microliters and the 
volumes of all reagents were adjusted accordingly. "Test" DNA 
(from tumors and cell lines) was fluorescently labeled (Cy5) and
hybridized to a human cDNA microarray containing 6,691 
different mapped human genes (i.e., UniGene clusters). The 
"reference" (labeled with Cy3) for each hybridization was nor- 
mal female leukocyte DNA from a single donor. The fabrication 
of cDNA microarrays and the labeling and hybridization of 
mRNA samples have been described (8). 

Data Analysis and Map Positions. Hybridized arrays were scanned 
on a GenePix scanner (Axon Instruments, Foster City, CA), and 
fluorescence ratios (test/reference) calculated using scanalyze 
software (available at http://rana.lbl.gov). Fluorescence ratios 
were normalized for each array by setting the average log 
fluorescence ratio for all array elements equal to 0. Measure- 
ments with fluorescence intensities more than 20% above back- 
ground were considered reliable. DNA copy number profiles 
that deviated significantly from background ratios measured in 
normal genomic DNA control hybridizations were interpreted as 
evidence of real DNA copy number alteration (see Estimating 
Significance of Altered Fluorescence Ratios in the supporting 
information). When indicated, DNA copy number profiles are 
displayed as a moving average (symmetric 5-nearest neighbors). 
Map positions for arrayed human cDNAs were assigned by 
Fig. 1. Genome-wide measurement of DNA copy number alteration by array CGH. (a) DNA copy number profiles are illustrated for cell lines containing different
numbers of X chromosomes, for breast cancer cell lines, and for breast tumors. Each row represents a different cell line or tumor, and each column represents
one of 6,691 different mapped human genes present on the microarray, ordered by genome map position from 1pter through Xqter. Moving average (symmetric
5-nearest neighbors) fluorescence ratios (test/reference) are depicted using a log2-based pseudocolor scale (indicated), such that red luminescence reflects
fold-amplification, green luminescence reflects fold-deletion, and black indicates no change (gray indicates poorly measured data). (b) Enlarged view of DNA
copy number profiles across the X chromosome, shown for cell lines containing different numbers of X chromosomes.



identifying the starting position of the best and longest match of 
any DNA sequence represented in the corresponding UniGene 
cluster (10) against the "Golden Path" genome assembly 
(http://genome.ucsc.edu/; Oct 7, 2000 Freeze). For UniGene 
clusters represented by multiple arrayed elements, mean fluo- 
rescence ratios (for all elements representing the same UniGene 
cluster) are reported. For mRNA measurements, fluorescence 
ratios are "mean-centered" (i.e., reported relative to the mean 
ratio across the 44 tumor samples). The data set described here 
can be accessed in its entirety in the supporting information. 
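
The normalization, reliability filter, smoothing, and mean-centering steps described above can be sketched as follows. This is a minimal illustration and not the authors' software: the function names and toy values are hypothetical, base-2 logarithms are assumed because the figures use a log2 scale, and "symmetric 5-nearest neighbors" is read here as a five-gene centered window that is truncated at chromosome ends.

```python
import math

def normalize_log_ratios(ratios):
    """Convert test/reference ratios to log2 and shift so the array-wide mean is 0."""
    logs = [math.log2(r) for r in ratios]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

def is_reliable(intensity, background, min_fold=1.2):
    """Keep a measurement only if its intensity is more than 20% above background."""
    return intensity > min_fold * background

def moving_average(values, half_window=2):
    """Centered moving average over genes ordered by map position.

    Assumption: a 5-gene window (two neighbors on each side), shortened
    where fewer neighbors exist at either end.
    """
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed

def mean_center(log_ratios_across_samples):
    """Report one gene's mRNA log ratios relative to their mean across samples.

    Assumption: centering is done on the log scale.
    """
    m = sum(log_ratios_across_samples) / len(log_ratios_across_samples)
    return [x - m for x in log_ratios_across_samples]

# Toy data (hypothetical): five array elements from a single hybridization.
raw = [0.5, 1.0, 2.0, 4.0, 1.0]
norm = normalize_log_ratios(raw)
smooth = moving_average(norm)
```

Working on the log scale keeps gains and losses symmetric around zero, which is presumably why the pseudocolor scales in the figures are log2-based.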

Results 

We performed CGH on 44 predominantly locally advanced, 
primary breast tumors and 10 breast cancer cell lines, using 
cDNA microarrays containing 6,691 different mapped human 
genes (Fig. 1a; also see Materials and Methods for details of
microarray hybridizations). To take full advantage of the im- 
proved spatial resolution of array CGH, we ordered (fluores- 
cence ratios for) the 6,691 cDNAs according to the "Golden 
Path" (http://genome.ucsc.edu/) genome assembly of the draft 
human genome sequences (11). In so doing, arrayed cDNAs not 
only themselves represent genes of potential interest (e.g., 
candidate oncogenes within amplicons), but also provide precise 
genetic landmarks for chromosomal regions of amplification and 



deletion. Parallel analysis of DNA from cell lines containing 
different numbers of X chromosomes (Fig. 1b), as we did before
(7), demonstrated the sensitivity of our method to detect single-copy
loss (45,XO), and 1.5- (47,XXX), 2- (48,XXXX), or
2.5-fold (49,XXXXX) gains (also see Fig. 5, which is published 
as supporting information on the PNAS web site). Fluorescence 
ratios were linearly proportional to copy number ratios, which 
were slightly underestimated, in agreement with previous ob- 
servations (7). Numerous DNA copy number alterations were 
evident in both the breast cancer cell lines and primary tumors 
(Fig. 1a), detected in the tumors despite the presence of euploid
non-tumor cell types; the magnitudes of the observed changes 
were generally lower in the tumor samples. DNA copy-number 
alterations were found in every cancer cell line and tumor, and 
on every human chromosome in at least one sample. Recurrent 
regions of DNA copy number gain and loss were readily identifiable.
For example, gains within 1q, 8q, 17q, and 20q were
observed in a high proportion of breast cancer cell lines/tumors
(90%/69%, 100%/47%, 100%/60%, and 90%/44%, respectively),
as were losses within 1p, 3p, 8p, and 13q (80%/24%,
80%/22%, 80%/22%, and 70%/18%, respectively), consistent
with published cytogenetic studies (refs. 2-4; a complete listing 
of gains/losses is provided in Tables 2 and 3, which are published 
as supporting information on the PNAS web site). The total 
Fig. 2. DNA copy number alteration across chromosome 8 by array CGH. (a) DNA copy number profiles are illustrated for cell lines containing different numbers
of X chromosomes, for breast cancer cell lines, and for breast tumors. Breast cancer cell lines and tumors are separately ordered by hierarchical clustering to
highlight recurrent copy number changes. The 241 genes present on the microarrays and mapping to chromosome 8 are ordered by position along the
chromosome. Fluorescence ratios (test/reference) are depicted by a log2 pseudocolor scale (indicated). Selected genes are indicated with color-coded text (red,
increased; green, decreased; black, no change; gray, not well measured) to reflect correspondingly altered mRNA levels (observed in the majority of the subset
of samples displaying the DNA copy number change). The map positions for genes of interest that are not represented on the microarray are indicated in the
row above those genes represented on the array. (b) Graphical display of DNA copy number profile for breast cancer cell line SKBR3. Fluorescence ratios
(tumor/normal) are plotted on a log2 scale for chromosome 8 genes, ordered along the chromosome.



number of genomic alterations (gains and losses) was found to 
be significantly higher in breast tumors that were high grade (P = 
0.008), consistent with published CGH data (3), estrogen receptor
negative (P = 0.04), and harboring TP53 mutations (P =
0.0006) (see Table 4, which is published as supporting information
on the PNAS web site).
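
The X-chromosome titration reported above has a simple expected outcome against a normal female (two X) reference, and the lower magnitudes seen in tumor specimens are what a mixture with euploid cells would produce. The sketch below illustrates both points. The mixing model (a cell-count-weighted average of DNA content) and the function names are assumptions introduced for illustration; they are not taken from the paper.

```python
def expected_ratio(test_copies, reference_copies=2):
    """Ideal test/reference ratio for a locus, assuming a perfectly linear assay."""
    return test_copies / reference_copies

def diluted_ratio(tumor_copies, tumor_fraction, normal_copies=2):
    """Ratio expected when tumor cells are mixed with euploid non-tumor cells.

    Assumption (not from the source): the signal is a simple
    cell-count-weighted average of tumor and normal DNA content.
    """
    mixed = tumor_fraction * tumor_copies + (1 - tumor_fraction) * normal_copies
    return mixed / normal_copies

# One to five X chromosomes measured against a two-X reference.
x_series = {n: expected_ratio(n) for n in (1, 2, 3, 4, 5)}

# An 8-copy amplicon in a specimen that is 50% tumor cells.
pure = diluted_ratio(8, 1.0)
mixed = diluted_ratio(8, 0.5)
```

Under these assumptions the series reproduces the 0.5-, 1.5-, 2-, and 2.5-fold values quoted in the text, and a specimen that is half tumor cells shows a 4-fold amplification as only 2.5-fold.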

The improved spatial resolution of our array CGH analysis is 
illustrated for chromosome 8, which displayed extensive DNA 
copy number alteration in our series. A detailed view of the 
variation in the copy number of 241 genes mapping to chromo- 
some 8 revealed multiple regions of recurrent amplification; 
each of these potentially harbors a different known or previously 
uncharacterized oncogene (Fig. 2a). The complexity of amplicon 
structure is most easily appreciated in the breast cancer cell line 
SKBR3. Although a conventional CGH analysis of 8q in SKBR3 
identified only two distinct regions of amplification (12), we 
observed three distinct regions of high-level amplification (la- 
beled 1-3 in Fig. 2b). For each of these regions we can define the 



boundaries of the interval recurrently amplified in the tumors we 
examined; in each case, known or plausible candidate oncogenes 
can be identified (a description of these regions, as well as the 
recurrently amplified regions on chromosomes 17 and 20, can be 
found in Figs. 6 and 7, which are published as supporting 
information on the PNAS web site). 

For a subset of breast cancer cell lines and tumors (4 and 37, 
respectively), and a subset of arrayed genes (6,095), mRNA 
levels were quantitatively measured in parallel by using cDNA 
microarrays (8). The parallel assessment of mRNA levels is 
useful in the interpretation of DNA copy number changes. For 
example, the highly amplified genes that are also highly ex- 
pressed are the strongest candidate oncogenes within an ampli- 
con. Perhaps more significantly, our parallel analysis of DNA 
copy number changes and mRNA levels provides us the oppor- 
tunity to assess the global impact of widespread DNA copy 
number alteration on gene expression in tumor cells. 

A strong influence of DNA copy number on gene expression 
is evident in an examination of the pseudocolor representations 
Fig. 3. Concordance between DNA copy number and gene expression across chromosome 17. DNA copy number alteration (Upper) and mRNA levels (Lower)
are illustrated for breast cancer cell lines and tumors. Breast cancer cell lines and tumors are separately ordered by hierarchical clustering (Upper), and the
identical sample order is maintained (Lower). The 354 genes present on the microarrays and mapping to chromosome 17, and for which both DNA copy number
and mRNA levels were determined, are ordered by position along the chromosome; selected genes are indicated in color-coded text (see Fig. 2 legend).
Fluorescence ratios (test/reference) are depicted by separate log2 pseudocolor scales (indicated).



of DNA copy number and mRNA levels for genes on chromo- 
some 17 (Fig. 3). The overall patterns of gene amplification and 
elevated gene expression are quite concordant; i.e., a significant 
fraction of highly amplified genes appear to be correspondingly 
highly expressed. The concordance between high-level amplifi- 
cation and increased gene expression is not restricted to chro- 
mosome 17. Genome-wide, of 117 high-level DNA amplifica- 
tions (fluorescence ratios >4, and representing 91 different 
genes), 62% (representing 54 different genes; see Table 5, which 
is published as supporting information on the PNAS web site) 
are found associated with at least moderately elevated mRNA 
levels (mean-centered fluorescence ratios >2), and 42% (rep- 
resenting 36 different genes) are found associated with compa- 
rably highly elevated mRNA levels (mean-centered fluorescence 
ratios >4). 
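
The concordance figures in this paragraph are fractions of high-level amplifications that exceed an mRNA cutoff. A minimal sketch of that tally is shown below; the cutoffs (DNA ratio >4, mRNA ratio >2 or >4) come from the text, while the function name and the toy measurements are hypothetical and are not data from the study.

```python
def concordance(pairs, dna_cutoff=4.0, mrna_cutoffs=(2.0, 4.0)):
    """Tally elevated expression among high-level amplifications.

    pairs: iterable of (dna_ratio, mean_centered_mrna_ratio), one per
    gene/sample measurement. Returns the number of high-level
    amplifications and, per mRNA cutoff, the fraction exceeding it.
    """
    amplified = [(d, m) for d, m in pairs if d > dna_cutoff]
    if not amplified:
        return 0, {}
    fractions = {
        c: sum(1 for _, m in amplified if m > c) / len(amplified)
        for c in mrna_cutoffs
    }
    return len(amplified), fractions

# Hypothetical measurements for illustration only.
toy = [(5.0, 4.5), (6.0, 2.5), (4.2, 1.1), (8.0, 9.0), (1.0, 3.0), (0.7, 0.5)]
n_amp, frac = concordance(toy)
```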

To determine the extent to which DNA deletion and lower- 
level amplification (in addition to high-level amplification) are 
also associated with corresponding alterations in mRNA levels, 
we performed three separate analyses on the complete data set 
(4 cell lines and 37 tumors, across 6,095 genes). First, we 
determined the average mRNA levels for each of five classes 
of genes, representing DNA deletion, no change, and low-, 
medium-, and high-level amplification (Fig. 4a). For both the 



breast cancer cell lines and tumors, average mRNA levels 
tracked with DNA copy number across all five classes, in a 
statistically significant fashion (P values for pair-wise Student's 
t tests comparing adjacent classes: cell lines, 4 x 10^-49, 1 x 10^-49,
5 x 10^-5, 1 x 10^-2; tumors, 1 x 10^-43, 1 x 10^-214, 5 x 10^-41,
1 x 10^-4). A linear regression of the average log(DNA copy
number), for each class, against average log(mRNA level) 
demonstrated that on average, a 2-fold change in DNA copy 
number was accompanied by 1.4- and 1.5-fold changes in mRNA 
level for the breast cancer cell lines and tumors, respectively (Fig. 
4a, regression line not shown). Second, we characterized the 
distribution of the 6,095 correlations between DNA copy number
and mRNA level, each across the 37 tumor samples (Fig. 4b).
The distribution of correlations forms a normal-shaped curve, 
but with the peak markedly shifted in the positive direction from 
zero. This shift is statistically significant, as evidenced in a plot 
of observed vs. expected correlations (Fig. 4c), and reflects a 
pervasive global influence of DNA copy number alterations on 
gene expression. Notably, the highest correlations between DNA 
copy number and mRNA level (the right tail of the distribution 
in Fig. 4b) comprise both amplified and deleted genes (data not 
shown). Third, we used a linear regression model to estimate the 
fraction of all variation measured in mRNA levels among the 37 
Fig. 4. Genome-wide influence of DNA copy number alterations on mRNA levels. (a) For breast cancer cell lines (gray) and tumor samples (black), both
mean-centered mRNA fluorescence ratio (log2 scale) quartiles (box plots indicate 25th, 50th, and 75th percentile) and averages (diamonds; Y-value error bars
indicate standard errors of the mean) are plotted for each of five classes of genes, representing DNA deletion (tumor/normal ratio < 0.8), no change (0.8-1.2),
low- (1.2-2), medium- (2-4), and high-level (>4) amplification. P values for pair-wise Student's t tests, comparing averages between adjacent classes (moving
left to right), are 4 x 10^-49, 1 x 10^-49, 5 x 10^-5, 1 x 10^-2 (cell lines), and 1 x 10^-43, 1 x 10^-214, 5 x 10^-41, 1 x 10^-4 (tumors). (b) Distribution of correlations between
DNA copy number and mRNA levels, for 6,095 different human genes across 37 breast tumor samples. (c) Plot of observed versus expected correlation coefficients.
The expected values were obtained by randomization of the sample labels in the DNA copy number data set. The line of unity is indicated. (d) Percent variance
in gene expression (among tumors) directly explained by variation in gene copy number. Percent variance explained (black line) and fraction of data retained
(gray line) are plotted for different fluorescence intensity/background (a rough surrogate for signal/noise) cutoff values. Fraction of data retained is relative
to the 1.2 intensity/background cutoff. Details of the linear regression model used to estimate the fraction of variation in gene expression attributable to
underlying DNA copy number alteration can be found in the supporting information (see Estimating the Fraction of Variation in Gene Expression Attributable
to Underlying DNA Copy Number Alteration).

tumors that could be attributed to underlying variation in DNA 
copy number. From this analysis, we estimate that, overall, about
7% of all of the observed variation in mRNA levels can be
explained directly by variation in copy number of the altered
genes (Fig. 4d). We can reduce the effects of experimental
measurement error on this estimate by using only that fraction 
of the data most reliably measured (fluorescence intensity/ 
background >3); using that data, our estimate of the percent 
variation in mRNA levels directly attributed to variation in gene 
copy number increases to 12% (Fig. 4d). This still undoubtedly 
represents a significant underestimate, as the observed variation 
in global gene expression is affected not only by true variation in
the expression programs of the tumor cells themselves, but also 
by the variable presence of non-tumor cell types within clinical 
samples. 
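
The first two analyses above rest on standard calculations: binning DNA ratios into five classes, regressing log mRNA on log DNA copy number, and correlating the two per gene. The sketch below shows how a log-log slope converts into the "fold change in mRNA per 2-fold change in DNA" statement. The class cutoffs are those given in the Fig. 4 legend; the handling of values exactly on a cutoff, the function names, and the toy class averages are assumptions, and the authors' variance-explained model (described only in their supporting information) is not reproduced here.

```python
import math

def copy_number_class(ratio):
    """Assign a tumor/normal DNA ratio to one of the five classes in Fig. 4a.

    Cutoffs (0.8, 1.2, 2, 4) come from the figure legend; which class owns
    a value exactly on a cutoff is an assumption of this sketch.
    """
    if ratio < 0.8:
        return "deletion"
    if ratio <= 1.2:
        return "no change"
    if ratio <= 2:
        return "low"
    if ratio <= 4:
        return "medium"
    return "high"

def _centered_sums(x, y):
    """Sums of squares and cross-products about the means of x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy, sxx, syy

def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    sxy, sxx, _ = _centered_sums(x, y)
    return sxy / sxx

def pearson_r(x, y):
    """Pearson correlation, e.g. one gene's DNA vs. mRNA log ratios across tumors."""
    sxy, sxx, syy = _centered_sums(x, y)
    return sxy / math.sqrt(sxx * syy)

def mrna_fold_per_dna_doubling(log2_dna, log2_mrna):
    """mRNA fold change accompanying a 2-fold change in DNA copy number."""
    return 2 ** ols_slope(log2_dna, log2_mrna)

# Hypothetical class averages on a log2 scale (not values from the study),
# constructed so that the slope equals log2(1.5).
class_dna = [-0.5, 0.0, 0.5, 1.5, 2.5]
class_mrna = [math.log2(1.5) * v for v in class_dna]
fold = mrna_fold_per_dna_doubling(class_dna, class_mrna)
r = pearson_r(class_dna, class_mrna)
```

On a log2 scale a slope of about 0.58 corresponds to the reported 1.5-fold change in mRNA per DNA doubling, and a slope of about 0.49 to the 1.4-fold value reported for cell lines.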

Discussion 

This genome-wide, array CGH analysis of DNA copy number 
alteration in a series of human breast tumors demonstrates the 
usefulness of defining amplicon boundaries at high resolution 
(gene-by-gene), and quantitatively measuring amplicon shape, to 
assist in locating and identifying candidate oncogenes. By ana- 
lyzing mRNA levels in parallel, we have also discovered that 
changes in DNA copy number have a large, pervasive, direct 
effect on global gene expression patterns in both breast cancer 



cell lines and tumors. Although the DNA microarrays used in our 
analysis may display a bias toward characterized and/or highly 
expressed genes, because we are examining such a large fraction 
of the genome (approximately 20% of all human genes), and 
because, as detailed above, we are likely underestimating the 
contribution of DNA copy number changes to altered gene 
expression, we believe our findings are likely to be generalizable
(but would nevertheless still be remarkable if only applicable to
this set of ~6,100 genes).

In budding yeast, aneuploidy has been shown to result in 
chromosome-wide gene expression biases (13). Two recent 
studies have begun to examine the global relationship between 
DNA copy number and gene expression in cancer cells. In 
agreement with our findings, Phillips et al. (14) have shown that
with the acquisition of tumorigenicity in an immortalized pros- 
tate epithelial cell line, new chromosomal gains and losses 
resulted in a statistically significant respective increase and 
decrease in the average expression level of involved genes. In 
contrast, Platzer et al. (15) recently reported that in metastatic
colon tumors only ~4% of genes within amplified regions were
found more highly (>2-fold) expressed, when compared with 
normal colonic epithelium. This report differs substantially from 
our finding that 62% of highly amplified genes in breast cancer 
exhibit at least 2-fold increased expression. These contrasting 
findings may reflect methodological differences between the 
studies. For example, the study of Platzer et al. (15) may have
systematically under-measured gene expression changes. In this
regard it is remarkable that only 14 transcripts of many thousand
residing within unamplified chromosomal regions were found to
exhibit at least 4-fold altered expression in metastatic colon
cancer. Additionally, their reliance on lower-resolution chromosomal
CGH may have resulted in poorly delimiting the boundaries
of high-complexity amplicons, effectively overcalling regions
with amplification. Alternatively, the contrasting findings
for amplified genes may represent real biological differences 
between breast and metastatic colon tumors; resolution of this 
issue will require further studies. 

Our finding that widespread DNA copy number alteration has 
a large, pervasive and direct effect on global gene expression 
patterns in breast cancer has several important implications. 
First, this finding supports a high degree of copy number- 
dependent gene expression in tumors. Second, it suggests that 
most genes are not subject to specific autoregulation or dosage 
compensation. Third, this finding cautions that elevated expres- 
sion of an amplified gene cannot alone be considered strong 
independent evidence of a candidate oncogene's role in tumor- 
igenesis. In our study, fully 62% of highly amplified genes 
demonstrated moderately or highly elevated expression. This 
highlights the importance of high-resolution mapping of amplicon
boundaries and shape [to identify the "driving" gene(s)
within amplicons (16)], on a large number of samples, in addition 
to functional studies. Fourth, this finding suggests that analyzing 



the genomic distribution of expressed genes, even within existing 
microarray gene expression data sets, may permit the inference 
of DNA copy number aberration, particularly aneuploidy (where 
gene expression can be averaged across large chromosomal 
regions; see Fig. 3 and supporting information). Fifth, this 
finding implies that a substantial portion of the phenotypic 
uniqueness (and by extension, the heterogeneity in clinical 
behavior) among patients' tumors may be traceable to underlying
variation in DNA copy number. Sixth, this finding supports
a possible role for widespread DNA copy number alteration in 
tumorigenesis (17, 18), beyond the amplification of specific 
oncogenes and deletion of specific tumor suppressor genes. 
Widespread DNA copy number alteration, and the concomitant 
widespread imbalance in gene expression, might disrupt critical 
stoichiometric relationships in cell metabolism and physiology
(e.g., proteasome, mitotic spindle), possibly promoting further
chromosomal instability and directly contributing to tumor 
development or progression. Finally, our findings suggest the 
possibility of cancer therapies that exploit specific or global 
imbalances in gene expression in cancer. 
Hughes Medical Institute, the Norwegian Cancer Society, and the
Norwegian Research Council 



1. Kallioniemi, A., Kallioniemi, O. P., Sudar, D., Rutovitz, D., Gray, J. W.,
Waldman, F. & Pinkel, D. (1992) Science 258, 818-821.

2. Kallioniemi, A., Kallioniemi, O. P., Piper, J., Tanner, M., Stokke, T., Chen, L.,
Smith, H. S., Pinkel, D., Gray, J. W. & Waldman, F. M. (1994) Proc. Natl. Acad.
Sci. USA 91, 2156-2160.

3. Tirkkonen, M., Tanner, M., Karhu, R., Kallioniemi, A., Isola, J. & Kallioniemi,
O. P. (1998) Genes Chromosomes Cancer 21, 177-184.

4. Forozan, F., Mahlamaki, E. H., Monni, O., Chen, Y., Veldman, R., Jiang, Y.,
Gooden, G. C., Ethier, S. P., Kallioniemi, A. & Kallioniemi, O. P. (2000) Cancer
Res. 60, 4519-4525.

5. Solinas-Toldo, S., Lampel, S., Stilgenbauer, S., Nickolenko, J., Benner, A., Dohner,
H., Cremer, T. & Lichter, P. (1997) Genes Chromosomes Cancer 20, 399-407.

6. Pinkel, D., Segraves, R., Sudar, D., Clark, S., Poole, I., Kowbel, D., Collins, C.,
Kuo, W. L., Chen, C., Zhai, Y., et al. (1998) Nat. Genet. 20, 207-211.

7. Pollack, J. R., Perou, C. M., Alizadeh, A. A., Eisen, M. B., Pergamenschikov,
A., Williams, C. F., Jeffrey, S. S., Botstein, D. & Brown, P. O. (1999) Nat. Genet.
23, 41-46.

8. Perou, C. M., Sorlie, T., Eisen, M. B., van de Rijn, M., Jeffrey, S. S., Rees, C. A.,
Pollack, J. R., Ross, D. T., Johnsen, H., Akslen, L. A., et al. (2000) Nature
(London) 406, 747-752.

9. Sorlie, T., Perou, C. M., Tibshirani, R., Aas, T., Geisler, S., Johnsen, H., Hastie,
T., Eisen, M. B., van de Rijn, M., Jeffrey, S. S., et al. (2001) Proc. Natl. Acad.
Sci. USA 98, 10869-10874.

10. Schuler, G. D. (1997) J. Mol. Med. 75, 694-698.

11. Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin,
J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. (2001) Nature
(London) 409, 860-921.

12. Fejzo, M. S., Godfrey, T., Chen, C., Waldman, F. & Gray, J. W. (1998) Genes
Chromosomes Cancer 22, 105-113.

13. Hughes, T. R., Roberts, C. J., Dai, H., Jones, A. R., Meyer, M. R., Slade, D.,
Burchard, J., Dow, S., Ward, T. R., Kidd, M. J., et al. (2000) Nat. Genet. 25,
333-337.

14. Phillips, J. L., Hayward, S. W., Wang, Y., Vasselli, J., Pavlovich, C., Padilla-
Nash, H., Pezullo, J. R., Ghadimi, B. M., Grossfeld, G. D., Rivera, A., et al.
(2001) Cancer Res. 61, 8143-8149.

15. Platzer, P., Upender, M. B., Wilson, K., Willis, J., Lutterbaugh, J., Nosrati, A.,
Willson, J. K., Mack, D., Ried, T. & Markowitz, S. (2002) Cancer Res. 62,
1134-1138.

16. Albertson, D. G., Ylstra, B., Segraves, R., Collins, C., Dairkee, S. H., Kowbel,
D., Kuo, W. L., Gray, J. W. & Pinkel, D. (2000) Nat. Genet. 25,
144-146.

17. Li, R., Yerganian, G., Duesberg, P., Kraemer, A., Willer, A., Rausch, C. &
Hehlmann, R. (1997) Proc. Natl. Acad. Sci. USA 94, 14506-14511.

18. Rasnick, D. & Duesberg, P. H. (1999) Biochem. J. 340, 621-630.



12968 | www.pnas.org/cgi/doi/10.1073/pnas.162471999



Pollack et al. 



MM TECHNICAL UPDATE 

FROM YOUR LABORATORY SERVICES PROVIDER 

HER-2/neu Breast Cancer Predictive Testing

Julie Sanford Hanna, Ph.D. and Dan Mornin, M.D.



Each year, over 182,000 women in the United States are
diagnosed with breast cancer, and approximately 45,000 die 
of the disease. 1 Incidence appears to be increasing in the 
United States at a rate of roughly 2% per year. The reasons 
for the increase are unclear, but non-genetic risk factors appear 
to play a large role. 2 

Five-year survival rates range from approximately 65%- 
85%, depending on demographic group, with a significant 
percentage of women experiencing recurrence of their cancer 
within 10 years of diagnosis. One of the factors most predic- 
tive for recurrence once a diagnosis of breast cancer has been 
made is the number of axillary lymph nodes to which tumor 
has metastasized. Most node-positive women are given adju- 
vant therapy, which increases their survival. However, 20%- 
30% of patients without axillary node involvement also 
develop recurrent disease, and the difficulty lies in how to iden- 
tify this high-risk subset of patients. These patients could 
benefit from increased surveillance, early intervention, and 
treatment. 

Prognostic markers currently used in breast cancer recur- 
rence prediction include tumor size, histological grade, steroid 
hormone receptor status, DNA ploidy, proliferative index, and 
cathepsin D status. Expression of growth factor receptors and
over-expression of the HER-2/neu oncogene have also been 
identified as having value regarding treatment regimen and 
prognosis. 

HER-2/neu (also known as c-erbB2) is an oncogene that 
encodes a transmembrane glycoprotein that is homologous 
to, but distinct from, the epidermal growth factor receptor. 
Numerous studies have indicated that high levels of expression
of this protein are associated with rapid tumor growth,
certain forms of therapy resistance, and shorter disease-free
survival. The gene has been shown to be amplified and/or
overexpressed in 10%-30% of invasive breast cancers and in
40%-60% of intraductal breast carcinoma. 3 

There are two distinct FDA-approved methods by which 
HER-2/neu status can be evaluated: immunohistochemistry 
(IHC, HercepTest™) and FISH (fluorescent in situ hybridiza- 
tion, PathVysion™ Kit). Both methods can be performed on 
archived and current specimens. The first method allows visual 
assessment of the amount of HER-2/neu protein present on 
the cell membrane. The latter method allows direct quantifi- 
cation of the level of gene amplification present in the tumor, 
enabling differentiation between low- versus high-amplification.
At least one study has demonstrated a difference in



recurrence risk in women younger than 40 years of age for 
low- versus high-amplified tumors (54.5% compared to 
85.7%); this is compared to a recurrence rate of 16.7% for 
patients with no HER-2/neu gene amplification. 4 HER-2/neu 
status may be particularly important to establish in women with 
small (< 1 cm) tumor size. 

The choice of methodology for determination of HER-2/ 
neu status depends in part on the clinical setting. FDA approval 
for the Vysis FISH test was granted based on clinical trials 
involving 1549 node-positive patients. Patients received one 
of three different treatments consisting of different doses of 
cyclophosphamide, Adriamycin, and 5-fluorouracil (CAF). 
The study showed that patients with amplified HER-2/neu 
benefited from treatment with higher doses of adriamycin-based
therapy, while those with normal HER-2/neu levels did
not. The study therefore identified a subset of women who,
because they did not benefit from more aggressive treatment,
did not need to be exposed to the associated side effects. In 
addition, other evidence indicates that HER-2/neu amplifica- 
tion in node-negative patients can be used as an independent 
prognostic indicator for early recurrence, recurrent disease at 
any time and disease-related death. 5 Demonstration of HER-2/neu
gene amplification by FISH has also been shown to be
of value in predicting response to chemotherapy in stage-2 
breast cancer patients. 

Selection of patients for Herceptin® (Trastuzumab) monoclonal
antibody therapy, however, is based upon demonstration
of HER-2/neu protein overexpression using HercepTest™.
Studies using Herceptin® in patients with metastatic breast 
cancer show an increase in time to disease progression, 
increased response rate to chemotherapeutic agents and a small 
increase in overall survival rate. The FISH assays have not yet 
been approved for this purpose, and studies looking at response 
to Herceptin® in patients with or without gene amplification
status determined by FISH are in progress. 

In general, FISH and IHC results correlate well. However, 
subsets of tumors are found which show discordant results; 
i.e., protein overexpression without gene amplification or lack 
of protein overexpression with gene amplification. The clini- 
cal significance of such results is unclear. Based on the above 
considerations, HER-2/neu testing at SHMC/PAML will utilize
immunohistochemistry (HercepTest™) as a screen, followed
by FISH in IHC-negative cases. Alternatively, either
method may be ordered individually depending on the clini- 
cal setting or clinician preference. 
CPT code information

HER-2/neu via IHC

88342 (including interpretive report)

HER-2/neu via FISH

88271 x2 Molecular cytogenetics, DNA probe, each
88274 Molecular cytogenetics, interphase in situ hybridization,
analyze 25-99 cells
88291 Cytogenetics and molecular cytogenetics, interpretation
and report



Procedural Information 

Immunohistochemistry is performed using the FDA-approved
DAKO antibody kit, HercepTest™. The DAKO kit contains
reagents required to complete a two-step immunohistochemical
staining procedure for routinely processed, paraffin-embedded
specimens. Following incubation with the primary
rabbit antibody to human HER-2/neu protein, the kit employs
a ready-to-use dextran-based visualization reagent. This reagent
consists of both secondary goat anti-rabbit antibody
molecules and horseradish peroxidase molecules linked to a
common dextran polymer backbone, thus eliminating the need
for sequential application of link antibody and peroxidase-conjugated
antibody. Enzymatic conversion of the subsequently
added chromogen results in formation of visible
reaction product at the antigen site. The specimen is then counterstained;
a pathologist using light microscopy interprets
the results.

FISH analysis at SHMC/PAML is performed using the
FDA-approved PathVysion™ HER-2/neu DNA probe kit, produced
by Vysis, Inc. Formalin-fixed, paraffin-embedded breast
tissue is processed using routine histological methods, and then
slides are treated to allow hybridization of DNA probes to the
nuclei present in the tissue section. The PathVysion™ kit contains
two direct-labeled DNA probes, one specific for the
alphoid repetitive DNA (CEP 17, spectrum orange) present at
the chromosome 17 centromere and the second for the HER-2/neu
oncogene located at 17q11.2-12 (spectrum green). Enumeration
of the probes allows a ratio of the number of copies
of chromosome 17 to the number of copies of HER-2/neu to
be obtained; this enables quantification of low versus high
amplification levels, and allows an estimate of the percentage
of cells with HER-2/neu gene amplification. The clinically
relevant distinction is whether the gene amplification is due
to increased gene copy number on the two chromosome 17
homologues normally present or an increase in the number of
chromosome 17s in the cells. In the majority of cases, ratio
equivalents less than 2.0 are indicative of a normal/negative
result; ratios of 2.1 and over indicate that amplification is
present and to what degree. Interpretation of these data will be
performed and reported from the Vysis-certified Cytogenetics
laboratory at SHMC.
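
The ratio cut-offs quoted above can be illustrated with a short sketch. It is not the laboratory's procedure: the function name, the use of total signal counts over the scored nuclei, and the "borderline" label for values between 2.0 and 2.1 (a range the text does not address) are assumptions made for illustration. The ratio is taken here as HER-2/neu signals divided by CEP 17 signals, the orientation under which values below 2.0 read as normal.

```python
def interpret_her2_fish(her2_signals: int, cep17_signals: int) -> tuple[float, str]:
    """Return the HER-2/neu : CEP 17 signal ratio and an illustrative call.

    Cut-offs follow the text: below 2.0 is normal/negative, 2.1 and over
    indicates amplification. The 'borderline' label for the gap between
    them is an assumption, not something the source defines.
    """
    if her2_signals < 0 or cep17_signals <= 0:
        raise ValueError("signal counts must be non-negative, with at least one CEP 17 signal")
    ratio = her2_signals / cep17_signals
    if ratio < 2.0:
        call = "not amplified"
    elif ratio >= 2.1:
        call = "amplified"
    else:
        call = "borderline"  # hypothetical label for 2.0 <= ratio < 2.1
    return ratio, call


if __name__ == "__main__":
    # Hypothetical counts summed over the scored nuclei of one specimen.
    print(interpret_her2_fish(250, 50))
```

A ratio well above 2.1 also conveys the degree of amplification, which is why the sketch returns the number together with the call.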
[...] fluorescent in situ
hybridization and comparative genomic hybridization (CGH)
have revealed chromosomal aberrations that seem to be
characteristic of certain stages of disease progression. In the
case of non-invasive pTa transitional cell carcinomas (TCCs),
this includes loss of chromosome 9 or parts of it, as well as
loss of Y in males. In minimally invasive pT1 TCCs, the following
alterations have been reported: 2q-, [...]

[...] harbor tumor suppressor genes and oncogenes;
however, the large chromosomal areas involved often
contain many genes, making meaningful predictions of the
functional consequences of losses and gains very difficult.

In this investigation we have combined genome-wide technology
for detecting genomic gains and losses (CGH) with
gene expression profiling techniques (microarrays and proteomics)
to determine the effect of gene copy number on
transcript and protein levels in pairs of non-invasive and invasive
human bladder TCCs.

EXPERIMENTAL PROCEDURES 
Material - Bladder tumor biopsies were sampled [...]
consent was obtained [...]
staged by an experienced pathologist as pTa (superficial papillary),





Fig. 1. DNA copy number and mRNA expression level. [...]
expression level of specific genes, and overall [...] compared with
the non-invasive counterpart tumor 335; [...] 827 compared with the
non-invasive counterpart tumor 532. The average fluorescent signal
ratio between [...] is shown along the length of the chromosome
(left). The bold curve in the ratio profile represents a mean [...]
curves indicating one standard deviation. The central vertical line
(broken) [...] to it (dotted) indicate a ratio of 0.5 (left) and 2.0
(right). In chromosomes where the [...] alterations in DNA content,
the ratio profile of that chromosome is shown to the right of the
invasive tumor profile. [...] represents one gene each, identified
by the running numbers above the bars (the name of the gene can be
seen [...]). [...] indicate the purported location of the gene, and
the colors indicate the expression level of [...]: >2-fold increase
(black), >2-fold decrease (blue), no [...]. [...] entitled Expression
shows the resulting change in expression along the chromosome; the
colors indicate that [...] up-regulated (black), at least half of the
genes [...] regulated (blue), or more than half of the genes are [...]



grade I and II, respectively; tumors 733 and 827 were staged as pT1
(invasive into submucosa). 733 was staged as solid, and 827 was
staged as papillary, both grade III.

mRNA Preparation - Tissue biopsies, obtained fresh from surgery,
were embedded immediately in a sodium-guanidinium thiocyanate
solution and stored at -80 °C. Total RNA was isolated using the
RNAzol B RNA isolation method (WAK-Chemie Medical GmbH).
Poly(A)+ RNA was isolated by an oligo(dT) selection step (Oligotex
mRNA kit; Qiagen).

cRNA Preparation - 1 µg of mRNA was used as starting material.
The first and second strand cDNA synthesis was performed using the
SuperScript® Choice System (Invitrogen) according to the manufacturer's
instructions but using an oligo(dT) primer containing a T7 RNA
polymerase binding site. Labeled cRNA was prepared using the
MEGAscript® in vitro transcription kit (Ambion). Biotin-labeled CTP and
UTP (Enzo) were used, together with unlabeled NTPs, in the reaction.
Following the in vitro transcription reaction, the unincorporated
nucleotides were removed using RNeasy columns (Qiagen).

Array Hybridization and Scanning - Array hybridization and scanning
were modified from a previous method (13). 10 µg of cRNA was
fragmented at 94 °C for 35 min in buffer containing 40 mM Tris
acetate, pH 8.1, 100 mM KOAc, 30 mM MgOAc. Prior to hybridization,
fragmented cRNA in a 6x SSPE-T hybridization buffer (1 M NaCl,
10 mM Tris, pH 7.6, 0.005% Triton) was heated to 95 °C for 5 min,
subsequently cooled to 40 °C, and loaded onto the Affymetrix probe
array cartridge. The probe array was then incubated for 16 h at 40 °C
at constant rotation (60 rpm). The probe array was exposed to 10
washes in 6x SSPE-T at 25 °C followed by 4 washes in 0.5x SSPE-T
at 50 °C. The biotinylated cRNA was stained with a streptavidin-phycoerythrin
conjugate, 10 µg/ml (Molecular Probes), in 6x SSPE-T
for 30 min at 25 °C followed by 10 washes in 6x SSPE-T at 25 °C. The
probe arrays were scanned at 560 nm using a confocal laser scanning
microscope (made for Affymetrix by Hewlett-Packard). The readings
from the quantitative scanning were analyzed by Affymetrix gene
expression analysis software.

Microsatellite Analysis— Microsatellite analysis was performed as
described previously (14). Microsatellites were selected by use of
www.ncbi.nlm.nih.gov/genemap98, and primer sequences were obtained
from the genome data base at www.gdb.org. DNA was extracted
from tumor and blood and amplified by PCR in a volume of 20 µl for 35
cycles. The amplicons were denatured and electrophoresed for 3 h in an
ABI Prism 377. Data were collected in the Gene Scan program for
fragment analysis. Loss of heterozygosity was defined as less than 33%
of one allele detected in tumor amplicons compared with blood.
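
One way to read the 33% criterion is sketched below. The source does not give the exact arithmetic, so the normalization (tumor allele peak ratio divided by the blood allele peak ratio), the function names, and the peak values are assumptions for illustration only.

```python
def allele_retention(tumor_a: float, tumor_b: float,
                     blood_a: float, blood_b: float) -> float:
    """Fraction of the weaker allele retained in tumor, normalized to blood.

    Assumed formula: (tumor_a / tumor_b) / (blood_a / blood_b), folded so
    that the result is always <= 1 regardless of which allele is reduced.
    All four peak measurements must be positive.
    """
    if min(tumor_a, tumor_b, blood_a, blood_b) <= 0:
        raise ValueError("peak measurements must be positive")
    ratio = (tumor_a / tumor_b) / (blood_a / blood_b)
    return min(ratio, 1.0 / ratio)


def has_loh(tumor_a: float, tumor_b: float,
            blood_a: float, blood_b: float, cutoff: float = 0.33) -> bool:
    """Call LOH when less than 33% of one allele remains relative to blood."""
    return allele_retention(tumor_a, tumor_b, blood_a, blood_b) < cutoff


if __name__ == "__main__":
    # Hypothetical electropherogram peak heights (arbitrary units).
    print(has_loh(200, 1000, 1000, 1000))  # allele A reduced to 20%
    print(has_loh(900, 1000, 1000, 1000))  # allele A at 90%
```

Folding the ratio keeps the call symmetric, so it does not matter which of the two alleles is lost in the tumor.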

Proteomic Analysis— TCCs were minced into small pieces and
homogenized in a small glass homogenizer in 0.5 ml of lysis solution.
Samples were stored at -20 °C until use. The procedure for 2D gel
electrophoresis has been described in detail elsewhere (15, 16). Gels
were stained with silver nitrate and/or Coomassie Brilliant Blue. Proteins
were identified by a combination of procedures that included
microsequencing, mass spectrometry, two-dimensional gel Western
immunoblotting, and comparison with the master two-dimensional gel
image of human keratinocyte proteins; see biobase.dk/cgi-bin/celis.

CGH - Hybridization of differentially labeled tumor and normal DNA
to normal metaphase chromosomes was performed as described
previously (10). Fluorescein-labeled tumor DNA (200 ng), Texas Red-labeled
reference DNA (200 ng), and human Cot-1 DNA (20 µg) were
denatured at 37 °C for 5 min and applied to denatured normal metaphase
slides. Hybridization was at 37 °C for 2 days. After washing,
the slides were counterstained with 0.15 µg/ml 4,6-diamidino-2-phenylindole
in an anti-fade solution. A second hybridization was performed
for all tumor samples using fluorescein-labeled reference DNA
and Texas Red-labeled tumor DNA (inverse labeling) to confirm the
aberrations detected during the initial hybridization. Each CGH experiment
also included a normal control hybridization using fluorescein-
and Texas Red-labeled normal DNA. Digital image analysis was
used to identify chromosomal regions with abnormal fluorescence
ratios, indicating regions of DNA gains and losses. The average
green:red fluorescence intensity ratio profiles were calculated using
four images of each chromosome (eight chromosomes total) with
normalization of the green:red fluorescence intensity ratio for the
entire metaphase and background correction. Chromosome identification
was performed based on 4,6-diamidino-2-phenylindole banding
patterns. Only images showing uniform high intensity fluorescence
with minimal background staining were analyzed. All
centromeres, p arms of acrocentric chromosomes, and heterochromatic
regions were excluded from the analysis.

RESULTS 

Comparative Genomic Hybridization— The CGH analysis 
identified a number of chromosomal gains and losses in the 

Table I

Correlation between alterations detected by CGH and by expression monitoring
Top, CGH used as independent variable (if CGH alteration - what expression ratio was found); bottom, altered expression used as
independent variable (if expression alteration - what CGH deviation was found).

Top

Tumor pair     CGH alterations     Expression change clusters                          Concordance
733 vs. 335    13 Gain             10 Up-regulation, 0 Down-regulation, 3 No change    77%
733 vs. 335    10 Loss             1 Up-regulation, 5 Down-regulation, 4 No change     50%
827 vs. 532    10 Gain             8 Up-regulation, 0 Down-regulation, 2 No change     80%
827 vs. 532    12 Loss             3 Up-regulation, 2 Down-regulation, 7 No change     17%

Bottom

Tumor pair     Expression change clusters    CGH alterations                   Concordance
733 vs. 335    16 Up-regulation              11 Gain, 2 Loss, 3 No change      69%
733 vs. 335    21 Down-regulation            1 Gain, 8 Loss, 12 No change      38%
733 vs. 335    15 No change                  3 Gain, 3 Loss, 9 No change       60%
827 vs. 532    17 Up-regulation              10 Gain, 5 Loss, 2 No change      59%
827 vs. 532    9 Down-regulation             0 Gain, 3 Loss, 6 No change       33%
827 vs. 532    21 No change                  1 Gain, 3 Loss, 17 No change      81%


two invasive tumors (stage pT1, TCCs 733 and 827), whereas
the two non-invasive papillomas (stage pTa, TCCs 335 and
532) showed only 9p-, 9q22-q33-, and X-, and 7+, 9q-,
and Y-, respectively. Both invasive tumors showed changes
(1q22-24+, 2q14.1-qter-, 3q12-q13.3-, 6q12-q22-,
9q34+, 11q12-q13+, 17+, and 20q11.2-q12+) that are typical
for their disease stage, as well as additional alterations,
some of which are shown in Fig. 1. Areas with gains and
losses deviated from the normal copy number to some extent,
and the average numerical deviation from normal was 0.4-fold
in the case of TCC 733 and 0.3-fold for TCC 827. The largest
changes, amounting to at least a doubling of chromosomal
content, were observed at 1q23 in TCC 733 (Fig. 1A) and
20q12 in TCC 827 (Fig. 1B).

mRNA Expression in Relation to DNA Copy Number - The
mRNA levels from the two invasive tumors (TCCs 827 and
733) were compared with the two non-invasive counterparts
(TCCs 532 and 335). This was done in two separate experiments
in which we compared TCCs 733 to 335 and 827 to
532, respectively, using two different scaling settings for the
arrays to rule out scaling as a confounding parameter. Approximately
1,800 genes that yielded a signal on the arrays
were searched in the Unigene and Genemap data bases for
chromosomal location, and those with a known location
(1096) were plotted as bars covering their purported locus. In
that way it was possible to construct a graphic presentation of
DNA copy number and relative mRNA levels along the individual
chromosomes (Fig. 1).

For each mRNA a ratio was calculated between the level in
the invasive versus the non-invasive counterpart. Bars, which
represent chromosomal location of a gene, were color-coded
according to the expression ratio, and only differences larger
than 2-fold were regarded as informative (Fig. 1). The density
of genes along the chromosomes varied, and areas containing
only one gene were excluded from the calculations. The
resolution of the CGH method is very low, and some of the
outlier data may arise because the boundaries of
the chromosomal aberrations are not known at high resolution.
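
The per-gene and per-region scoring described above can be sketched as follows. The function names and the tie rule (equal numbers of up- and down-regulated genes score as no change) are illustrative assumptions; the 2-fold cut-off, the exclusion of single-gene areas, and the "at least half of the mRNAs" rule come from the text and from the legend to Fig. 2. The transcript values in the usage lines are the array units quoted in the legend to Fig. 5.

```python
from typing import Optional, Sequence


def classify_expression(invasive: float, noninvasive: float,
                        cutoff: float = 2.0) -> str:
    """Classify one gene by the invasive : non-invasive transcript ratio.

    Only differences larger than the cut-off count as a change.
    """
    if invasive <= 0 or noninvasive <= 0:
        raise ValueError("transcript levels must be positive")
    ratio = invasive / noninvasive
    if ratio > cutoff:
        return "up"
    if ratio < 1.0 / cutoff:
        return "down"
    return "no change"


def score_region(calls: Sequence[str]) -> Optional[str]:
    """Score a chromosomal area from its per-gene calls.

    Areas with a single gene are excluded (None). A change is scored when
    at least half of the genes agree; the tie rule is an assumption.
    """
    if len(calls) < 2:
        return None
    up, down = calls.count("up"), calls.count("down")
    if 2 * up >= len(calls) and up > down:
        return "up"
    if 2 * down >= len(calls) and down > up:
        return "down"
    return "no change"


if __name__ == "__main__":
    # Cytokeratin 13: 15659 units in TCC 532, 623 units in TCC 827.
    print(classify_expression(623, 15659))
    # PA-FABP: 1277 units in TCC 335, 166 units in TCC 733.
    print(classify_expression(166, 1277))
```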

Two sets of calculations were made from the data. For the
first set we used CGH alterations as the independent variable
and estimated the frequency of expression alterations in these
chromosomal areas. In general, areas with a strong gain of
chromosomal material contained a cluster of genes having
increased mRNA expression. For example, chromosomes
1q21-q25, 2p, and 9q showed a relative gain of more
than 100% in DNA copy number that was accompanied by
increased mRNA expression levels in the two tumor pairs (Fig.
1). In most cases, chromosomal gains detected by CGH were
accompanied by an increased level of transcripts in both
TCCs 733 (77%) and 827 (80%) (Table I, top). Chromosomal
losses, on the other hand, were not accompanied by decreased
expression in several cases, and were often registered
as having unaltered RNA levels (Table I, top). The inability
to detect RNA expression changes in these cases was not
because of fewer genes mapping to the lost regions (data not
shown).
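
The concordance percentages quoted from Table I follow directly from the counts in that table. A minimal sketch, in which the dictionary layout and the function name are illustrative choices:

```python
def concordance(counts: dict[str, int], expected: str) -> int:
    """Percentage of areas whose expression change matches the CGH change.

    `counts` maps each expression outcome to the number of areas showing
    it; `expected` is the outcome that agrees with the CGH alteration.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no areas to score")
    return round(100 * counts[expected] / total)


if __name__ == "__main__":
    # Counts from Table I, top (CGH used as the independent variable).
    gain_733 = {"up": 10, "down": 0, "no change": 3}
    loss_733 = {"up": 1, "down": 5, "no change": 4}
    gain_827 = {"up": 8, "down": 0, "no change": 2}
    loss_827 = {"up": 3, "down": 2, "no change": 7}
    print(concordance(gain_733, "up"), concordance(loss_733, "down"))
    print(concordance(gain_827, "up"), concordance(loss_827, "down"))
```

The same function reproduces the bottom half of Table I when the counts are CGH outcomes per expression cluster, for example 11 gains among 16 up-regulated clusters in TCC 733.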

Fig. 2. Correlation between maximum CGH aberration and the ability to detect expression change by oligonucleotide array
monitoring. The aberration is shown as a numerical -fold change in ratio between invasive tumors 827 (▲) and 733 (♦) and their non-invasive
counterparts 532 and 335. The expression change was taken from the Expression line to the right in Fig. 1, which depicts the resulting
expression change for a given chromosomal region. At least half of the mRNAs from a given region have to be either up- or down-regulated
to be scored as an expression change. All chromosomal arms in which the CGH ratio plus or minus one standard deviation was outside the
ratio value of one were included.

In the second set of calculations we selected expression
alterations above 2-fold as the independent variable and estimated
the frequency of CGH alterations in these areas. As
above, we found that increased transcript expression correlated
with gain of chromosomal material (TCC 733, 69% and
TCC 827, 59%), whereas reduced expression was often detected
in areas with unaltered CGH ratios (Table I, bottom).
Furthermore, as a control we looked at areas with no alteration
in expression. No alteration was detected by CGH in
most of these areas (TCC 733, 60% and TCC 827, 81%; see
Table I, bottom). Because the ability to observe reduced or
increased mRNA expression clustering to a certain chromosomal
area clearly reflected the extent of copy number
changes, we plotted the maximum CGH aberrations in the
regions showing CGH changes against the ability to detect a
change in mRNA expression as monitored by the oligonucleotide
arrays (Fig. 2). For both tumors TCC 733 (p < 0.015) and
TCC 827 (p < 0.00003) a highly significant correlation was
observed between the level of CGH ratio change (reflecting
the DNA copy number) and alterations detected by the array-based
technology (Fig. 2). Similar data were obtained when
areas with altered expression were used as independent variables.
These areas correlated best with CGH when the CGH
ratio deviated 1.6- to 2.0-fold (Table I, bottom) but mostly did
not at lower CGH deviations. These data probably reflect that
loss of an allele may only lead to a 50% reduction in expression
level, which is at the cut-off point for detection of expression
alterations. Gain of chromosomal material can occur to a
much larger extent.
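
The asymmetry between losses and gains can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that transcript level scales linearly with gene copy number from a normal two-copy state; the source offers this only as a probable explanation, and the function names are hypothetical.

```python
def expected_fold_change(tumor_copies: int, normal_copies: int = 2) -> float:
    """Expected transcript ratio under an assumed linear gene-dose effect."""
    if tumor_copies < 0 or normal_copies <= 0:
        raise ValueError("copy numbers must be non-negative, normal copies positive")
    return tumor_copies / normal_copies


def exceeds_cutoff(fold: float, cutoff: float = 2.0) -> bool:
    """True only when the change is larger than the 2-fold detection cut-off."""
    return fold > cutoff or fold < 1.0 / cutoff


if __name__ == "__main__":
    for copies in (1, 3, 4, 6):
        fold = expected_fold_change(copies)
        print(copies, fold, exceeds_cutoff(fold))
```

Under this assumption, loss of one allele gives a ratio of exactly 0.5, which sits on the cut-off rather than beyond it, whereas a gain has no comparable ceiling.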

Microsatellite-based Detection of Minor Areas of Losses—In
TCC 733, several chromosomal areas exhibiting DNA
amplification were preceded or followed by areas with a normal
CGH but reduced mRNA expression (see Fig. 1, TCC 733
chromosome 1q32, 2p21, 7q21 and q32, 9q34, and
10q22). To determine whether these results were because of
undetected loss of chromosomal material in these regions or
because of other non-structural mechanisms regulating transcription,
we examined two microsatellites positioned at chromosome
1q25-32 and two at chromosome 2p22. Loss of
heterozygosity (LOH) was found at both 1q25 and at 2p22,
indicating that minor deleted areas were not detected with the
resolution of CGH (Fig. 3). Additionally, chromosome 2p in
TCC 733 showed a CGH pattern of gain/no change/gain of
DNA that correlated with transcript increase/decrease/increase.
Thus, for the areas showing increased expression
there was a correlation with the DNA copy number alterations
(Fig. 1A). As indicated above, the mRNA decrease observed in
the middle of the chromosomal gain was because of LOH,
implying that one of the mechanisms for mRNA down-regulation
may be regions that have undergone smaller losses of
chromosomal material. However, this cannot be detected with
the resolution of the CGH method.

In both TCC 733 and TCC 827, the telomeric end of chromosome
11p showed a normal ratio in the CGH analysis;
however, clusters of five and three genes, respectively, lost
their expression. Two microsatellites (D11S1760, D11S922)
positioned close to MUC2, IGF2, and cathepsin D indicated
LOH as the most likely mechanism behind the loss of expression
(data not shown).

A reduced expression of mRNA observed in TCC 733 at
chromosomes 3q24, 11p11, 12p12.2, 12q21.1, and 16q24
and in TCC 827 at chromosomes 11p15.5, 12p11, 15q11.2,
and 18q12 was also examined for chromosomal losses using
microsatellites positioned as close as possible to the gene loci
Fig. 3. Microsatellite analysis of loss of heterozygosity. Tumor
733 showing loss of heterozygosity at chromosome 1q25, detected
(a) by D1S215 close to Hu class I histocompatibility antigen (gene
number 38 in Fig. 1), (b) by D1S2735 close to cathepsin E (gene
number 41 in Fig. 1), and (c) at chromosome 2p23 by D2S2251 close
to general β-spectrin (gene number 11 in Fig. 1), and (d) tumor 827
showing loss of heterozygosity at chromosome 18q12 by D18S1118
close to mitochondrial 3-oxoacyl-coenzyme A thiolase (gene number
12 in Fig. 1). The upper curves show the electropherogram obtained
from normal DNA from leukocytes (N), and the lower curves show the
electropherogram from tumor DNA (T). In all cases one allele is
partially lost in the tumor amplicon.

showing reduced mRNA transcripts. Only the microsatellite 
positioned at 18q12 showed LOH (Fig. 3), suggesting that
transcriptional down-regulation of genes in the other regions 
may be controlled by other mechanisms. 

Relation between Changes in mRNA and Protein Levels— 
2D-PAGE analysis, in combination with Coomassie Brilliant 
Blue and/or silver staining, was carried out on all four tumors
using fresh biopsy material. 40 well resolved abundant known 
proteins migrating in areas away from the edges of the pH 
Fig. 4. Correlation between protein levels as judged by 2D-PAGE
and transcript ratio. For comparison proteins were divided in
three groups, unaltered in level or up- or down-regulated (horizontal
axis); the mRNA ratio as determined by oligonucleotide arrays was
plotted for each gene (vertical axis). ▲, mRNAs that were scored as
present in both tumors used for the ratio calculation; △, mRNAs that
were scored as absent in the invasive tumors (along horizontal axis) or
as absent in non-invasive reference (top of figure). Two different
scalings were used to exclude scaling as a confounder; TCCs 827
and 532 (▲△) were scaled with background suppression, and TCCs
733 and 335 (●○) were scaled without suppression. Both comparisons
showed highly significant (p < 0.005) differences in mRNA ratios
between the groups. Proteins shown were as follows: Group A (from
left), phosphoglucomutase 1, glutathione transferase class μ number
4, fatty acid-binding protein homologue, cytokeratin 15, and cytokeratin
13; B (from left), fatty acid-binding protein homologue, 28-kDa
heat shock protein, cytokeratin 13, and calcyclin; C (from left), α-enolase,
hnRNP B1, 28-kDa heat shock protein, 14-3-3-ε, and
pre-mRNA splicing factor; D, mesothelial keratin K7 (type II); E (from
top), glutathione S-transferase-π and mesothelial keratin K7 (type II);
F (from top and left), adenylyl cyclase-associated protein, E-cadherin,
keratin 19, calgizzarin, phosphoglycerate mutase, annexin IV, cytoskeletal
γ-actin, hnRNP A1, integral membrane protein calnexin
(IP90), hnRNP H, brain-type clathrin light chain-a, hnRNP F, 70-kDa
heat shock protein, heterogeneous nuclear ribonucleoprotein A/B,
translationally controlled tumor protein, liver glyceraldehyde-3-phosphate
dehydrogenase, keratin 8, aldehyde reductase, and Na,K-ATPase
β-1 subunit; G (from top and left), TCP20, calgizzarin, 70-kDa
heat shock protein, calnexin, hnRNP H, cytokeratin 15, ATP
synthase, keratin 19, triosephosphate isomerase, hnRNP F, liver
glyceraldehyde-3-phosphate dehydrogenase, glutathione S-transferase-π,
and keratin 8; H (from left), plasma gelsolin, autoantigen calreticulin,
thioredoxin, and NAD+-dependent 15-hydroxyprostaglandin
dehydrogenase; I (from top), prolyl 4-hydroxylase β-subunit, cytokeratin
20, cytokeratin 17, prohibitin, and fructose 1,6-biphosphatase;
J, annexin II; K, annexin IV; L (from top and left), 90-kDa heat
shock protein, prolyl 4-hydroxylase β-subunit, α-enolase, GRP 78,
cyclophilin, and cofilin.

gradient, and having a known chromosomal location, were
selected for analysis in the TCC pair 827/532. Proteins were
identified by a combination of methods (see "Experimental
Procedures"). In general there was a highly significant correlation
(p < 0.005) between mRNA and protein alterations (Fig.
4). Only one gene showed disagreement between transcript
alteration and protein alteration. Except for a group of cyto- 
Fig. 5. Comparison of protein and transcript levels in invasive
and non-invasive TCCs. The upper part of the figure shows a 2D gel
(left) and the oligonucleotide array (right) of TCC 532. The red rectangles
on the upper gel highlight the areas that are compared below.
Identical areas of 2D gels of TCCs 532 and 827 are shown below.
Clearly, cytokeratins 13 and 15 are strongly down-regulated in TCC
827 (red annotation). The tile on the array containing probes for
cytokeratin 15 is enlarged below the array (red arrow) from TCC 532
and is compared with TCC 827. The upper row of squares in each tile
corresponds to perfect match probes; the lower row corresponds to
mismatch probes containing a mutation (used for correction for unspecific
binding). Absence of signal is depicted as black, and the
higher the signal the lighter the color. A high transcript level was
detected in TCC 532 (6151 units) whereas a much lower level was
detected in TCC 827 (absence of signals). For cytokeratin 13, a high
transcript level was also present in TCC 532 (15659 units), and a
much lower level was present in TCC 827 (623 units). The 2D gels at
the bottom of the figure (left) show levels of PA-FABP and adipocyte-FABP
in TCCs 335 and 733 (invasive), respectively. Both proteins are
down-regulated in the invasive tumor. To the right we show the array
tiles for the PA-FABP transcript. A medium transcript level was detected
in the case of TCC 335 (1277 units) whereas very low levels
were detected in TCC 733 (166 units). IEF, isoelectric focusing.



keratins encoded by genes on chromosome 17 (Fig. 5) the 
analyzed proteins did not belong to a particular family. 26 well
focused proteins whose genes had a known chromosomal
location were detected in TCCs 733 and 335, and of these 19
correlated (p < 0.005) with the mRNA changes detected using
the arrays (Fig. 4). For example, PA-FABP was highly expressed
in the non-invasive TCC 335 but lost in the invasive
counterpart (TCC 733; see Fig. 5). The smaller number of
proteins detected in both 733 and 335 was because of the
smaller size of the biopsies that were available.

11 chromosomal regions where CGH showed aberrations 
that corresponded to the changes in transcript levels also 
showed corresponding changes in the protein level (Table II). 
These regions included genes that encode proteins that are 
found to be frequently altered in bladder cancer, namely
cytokeratins 17 and 20, annexins II and IV, and the fatty 
acid-binding proteins PA-FABP and FBP1. Four of these pro- 
teins were encoded by genes in chromosome 17q, a fre- 
quently amplified chromosomal area in invasive bladder 
cancers. 

DISCUSSION

Most human cancers have abnormal DNA content, having
lost some chromosomal parts and gained others. The present
study provides some evidence as to the effect of these gains
and losses on gene expression in two pairs of non-invasive
and invasive TCCs using high throughput expression arrays
and proteomics, in combination with CGH. In general, the
results showed that there is a clear individual regulation of the
mRNA expression of single genes, which in some cases was
superimposed by a DNA copy number effect. In most cases,
genes located in chromosomal areas with gains exhibited
increased mRNA expression, whereas areas showing
losses showed either no change or a reduced mRNA expression.
The latter might be because losses most
often are restricted to loss of one allele, and the cut-off point
for detection of expression alterations was a 2-fold change,
thus being at the border of detection. In several cases, however,
an increase or decrease in DNA copy number was
associated with de novo occurrence or complete loss of transcript,
respectively. Some of these transcripts could not be
detected in the non-invasive tumor but were present at relatively
high levels in areas with DNA amplifications in the invasive
tumors (e.g. in TCC 733 transcript from cellular ligand of
annexin II gene (chromosome 1q21) from absent to 2670
arbitrary units; in TCC 827 transcript from small proline-rich
protein 1 gene (chromosome 1q12-q21.1) from absent to
1326 arbitrary units). It may be anticipated from these data
that significant clustering of genes with an increased expression
to a certain chromosomal area indicates an increased
likelihood of gain of chromosomal material in this area.

Table II

Proteins whose expression level correlates with both mRNA and gene dose changes

Protein                  Chromosomal location   Tumor TCC   CGH alteration   Transcript alteration    Protein alteration
Annexin II               1q21                   733         Gain             Abs to Pres (a)          Increase
Annexin IV               2p13                   733         Gain             3.9-Fold up              Increase
Cytokeratin 17           17q12-q21              827         Gain             3.8-Fold up              Increase
Cytokeratin 20           17q21.1                827         Gain             5.6-Fold up              Increase
(PA-)FABP                8q21.2                 827         Loss             10-Fold down             Decrease
FBP1                     9q22                   827         Gain             2.3-Fold up              Increase
Plasma gelsolin          9q31                   827         Gain             Abs to Pres              Increase
Heat shock protein 28    15q12-q13              827         Loss             2.5-Fold up              Decrease
Prohibitin               17q21                  827/733     Gain             3.7-/2.5-Fold up (b)     Increase
Prolyl 4-hydroxylase     17q25                  827/733     Gain             5.7-/1.6-Fold up         Increase
hnRNP B1                 7p15                   827         Loss             2.5-Fold down            Decrease

(a) Abs, absent; Pres, present.
(b) In cases where the corresponding alterations were found in both TCCs 827 and 733 these are shown as 827/733.

Considering the many possible regulatory mechanisms acting
at the level of transcription, it seems striking that the gene
dose effects were so clearly detectable in gained areas. One
hypothetical explanation may lie in the loss of controlled
methylation in tumor cells (17-19). Thus, it may be possible
that in chromosomes with increased DNA copy numbers two
or more alleles could be demethylated simultaneously, leading
to a higher transcription level, whereas in chromosomes with
losses the remaining allele could be partly methylated, turning
off the process (20, 21). A recent report has documented a
ploidy regulation of gene expression in yeast, but in this case all
the genes were present in the same ratio (22), a situation that is
not analogous to that of cancer cells, which show marked
chromosomal aberrations, as well as gene dosage effects.

Several CGH studies of bladder cancer have shown that
some chromosomal aberrations are common at certain
stages of disease progression, often occurring in more than 1
of 3 tumors. In pTa tumors, these include 9p-, 9q-, 1q+, Y-
(2, 6), and in pT1 tumors, 2q-, 11p-, 11q-, 5p+, 8q+,
17q+, and 20q+ (2-4, 6, 7). The pTa tumors studied here
showed similar aberrations such as 9p- and 9q22-q33- and
9q- and Y-, respectively. Likewise, the two minimally invasive
pT1 tumors showed aberrations that are commonly seen at
that stage, and TCC 827 had a remarkable resemblance to the
commonly seen pattern of losses and gains, such as 1q22-24
amplification (seen in both tumors), 11q14-q22 loss, the latter
often linked to 17q+ (both tumors), and 1q+ and 9p-, often
linked to 20q+ and 11q13+ (both tumors) (7-9). These observations
indicate that the pairs of tumors used in this study
exhibit chromosomal changes observed in many tumors, and
therefore the findings could be of general importance for
bladder cancer.

Considering that the mapping resolution of CGH is about
20 megabases, it is only possible to get a crude picture of
chromosomal instability using this technique. Occasionally,
we observed reduced transcript levels close to or inside regions
with increased copy numbers. Analysis of these regions
by positioning heterozygous microsatellites as close as possible
to the locus showing reduced gene expression revealed
loss of heterozygosity in several cases. It seems likely that
multiple and different events occur along each chromosomal
arm and that the use of cDNA microarrays for analysis of DNA
copy number changes will reach a resolution that can resolve
these changes, as has recently been proposed (2). The outlier
data were not more frequent at the boundaries of the CGH
aberrations. At present we do not know the mechanism behind
chromosomal aneuploidy and cannot predict whether
chromosomal gains will be transcribed to a larger extent than
the two native alleles. A mechanism such as genetic imprinting has
an impact on the expression level in normal cells and is often
reduced in tumors. However, the relation between imprinting
and gain of chromosomal material is not known.

We regard it as a strength of this investigation that we were
able to compare invasive tumors to benign tumors rather than
to normal urothelium, as the tumors studied were biologically
very close and probably represent successive steps in
the progression of bladder cancer. Despite the limited amount
of fresh tissue available it was possible to apply three different
state of the art methods. The observed correlation between
DNA copy number and mRNA expression is remarkable when
one considers that different pieces of the tumor biopsies were
used for the different sets of experiments. This indicates that
bladder tumors are relatively homogeneous, a notion recently
supported by CGH and LOH data that showed a remarkable
similarity even between tumors and distant metastasis (10, 23).

In the few cases analyzed, mRNA and protein levels
showed a striking correspondence, although in some cases
we found discrepancies that may be attributed to translational
regulation, post-translational processing, protein degradation,
or a combination of these. Some transcripts belong to
undertranslated mRNA pools, which are associated with few
translationally inactive ribosomes; these pools, however,
seem to be rare (24). Protein degradation, for example, may
be very important in the case of polypeptides with a short
half-life (e.g. signaling proteins). A poor correlation between
mRNA and protein levels was found in liver cells as determined
by arrays and 2D-PAGE (25), and a moderate correlation
was recently reported by Ideker et al. (26) in yeast.
Interestingly, our study revealed a much better correlation
between gained chromosomal areas and increased mRNA
levels than between loss of chromosomal areas and reduced
mRNA levels. In general, the level of CGH change determined
the ability to detect a change in transcript. One possible
explanation could be that by losing one allele the change in
mRNA level is not so dramatic as compared with gain of
material, which can be rather unlimited and may lead to a
severalfold increase in gene copy number resulting in a much
higher impact on transcript level. The latter would be much
easier to detect on the expression arrays as the cut-off point
was placed at a 2-fold level so as not to be biased by noise on
the array. Construction of arrays with a better signal to noise
ratio may in the future allow detection of less than 2-fold
alterations in transcript levels, a feature that may facilitate the
analysis of the effect of loss of chromosomal areas on transcript
levels.

In eleven cases we found a significant correlation between
DNA copy number, mRNA expression, and protein level. Four
of these proteins were encoded by genes located at a frequently
amplified area in chromosome 17q. Whether DNA
copy number is one of the mechanisms behind alteration of
these eleven proteins is at present unknown and will have to
be proved by other methods using a larger number of samples.
One factor making such studies complicated is the large
extent of protein modification that occurs after translation,
requiring immunoidentification and/or mass spectrometry to
correctly identify the proteins in the gels.

In conclusion, the results presented in this study exemplify
the large body of knowledge that may be possible to gather in
the future by combining state-of-the-art techniques that follow
the pathway from DNA to protein (26). Here, we used a traditional
chromosomal CGH method, but in the future high resolution
CGH based on microarrays with many thousand radiation
hybrid-mapped genes will increase the resolution and information
derived from these types of experiments (2). Combined with
expression arrays analyzing transcripts derived from genes with
known locations, and 2D gel analysis to obtain information at
the post-translational level, a clearer and more developed
understanding of the tumor genome will be forthcoming.

High-throughput technologies, such as proteomic screening and DNA micro-arrays, produce vast
amounts of data requiring comprehensive analytical methods to decipher the biologically relevant
results. One approach would be to manually search the biomedical literature; however, this would be
an arduous task. We developed an automated literature-mining tool, termed MedGene, which
comprehensively summarizes and estimates the relative strengths of all human gene-disease
relationships in Medline. Using MedGene, we analyzed a novel micro-array expression dataset
comparing breast cancer and normal breast tissue in the context of existing knowledge. We found no
correlation between the strength of the literature association and the magnitude of the difference in
expression level when considering changes as high as 5-fold; however, a significant correlation was
observed (r = 0.41; p = 0.05) among genes showing an expression difference of 10-fold or more.
Interestingly, this only held true for estrogen receptor (ER) positive tumors, not ER negative. MedGene
identified a set of relatively understudied, yet highly expressed genes in ER negative tumors worthy of
further examination.

Introduction

At its current pace, the accumulation of biomedical literature
outpaces the ability of most researchers and clinicians to stay
abreast of their own immediate fields, let alone cover a broader
range of topics. For example, to follow a single disease, e.g.,
breast cancer, a researcher would have had to scan 130 different
journals and read 27 papers per day in 1999.1 This problem is
accentuated with high-throughput technologies such as DNA
micro-arrays and proteomics, which require the analysis of
large datasets involving thousands of genes, many of which are
unfamiliar to a particular researcher. In any microarray experiment,
thousands of genes may demonstrate statistically significant
expression changes, but only a fraction of these may
be relevant to the study. The ability to interpret these datasets
would be enhanced if they could be compared to a comprehensive
summary of what is known about all genes. Thus, there
is a need to summarize existing knowledge in a format that
allows for the rapid analysis of associations between genes and
diseases or other specific biological concepts.

One solution to this problem is to compile structured digital
resources, such as the Breast Cancer Gene Database and the
Tumor Gene Database. However, as these resources are hand-curated,
the labor-intensive review process becomes a rate-limiting
step in the growth of the database. As a result, these
databases have a limited scale and the genes are not selected
in a systematic fashion.

An alternative approach is automated text mining, a method
which involves automated information extraction by searching
documents for text strings and analyzing their frequency and
context. This approach has been used successfully in several
instances for biological applications. In most cases, it has been
applied to extract information about the relationships or
interactions that proteins or genes have with one another in
the literature or by functional annotation. Thus far, few
publications have applied text-mining to examine the global
relationships between genes and diseases. Perez-Iratxeta et al.
automatically examined the GO (Gene Ontology) annotation
of genes and their predicted chromosomal locations in order
to identify genes linked to inherited disorders.8

To obtain a more global understanding of disease development,
it would be valuable to incorporate information regarding
all possible gene-disease relationships, including biochemical,
physiological, pharmacological, epidemiological, as well as
genetic. This information would enable comprehensive comparisons
between large experimental datasets and existing
knowledge in the literature. This would accomplish two things.
First, it would serve to validate experiments by demonstrating
that known responses occur as predicted. Second, it would
rapidly highlight which genes are corroborated by the literature
and which genes are novel in a given context. We have utilized
a computational approach to literature mining to produce a
comprehensive set of gene-disease relationships. In addition,
we have developed a novel approach to assess the strength of
each association based on the frequency of citation and co-citation.
We applied this tool to help interpret the data from a
large micro-array gene expression experiment comparing
normal and cancerous breast tissue.



Methods 

MedGene Database. MedGene is a relational database, storing
disease and gene information from NCBI, text mining results,
statistical scores, and hyperlinks to the primary literature.
MedGene has a web-based user interface for users to
query the database (http://hipseq.med.harvard.edu/MedGene/).

Text Mining Algorithms. MeSH files were downloaded from
the MeSH web site at NLM (National Library of Medicine)
(http://www.nlm.nih.gov/mesh/meshhome.html) and human disease
categories were selected. LocusLink files were downloaded from
the LocusLink web site at NCBI (http://www.ncbi.nih.gov/LocusLink/).
Official/preferred gene symbol, official/preferred
gene name, and gene alternative symbols and names, all
relevant annotations and URLs for each LocusLink record, were
collected. Gene search terms were used for literature searching
and included all qualified gene names, gene symbols, and gene
family terms. Primary gene keys, predominantly qualified gene
family terms and gene official/preferred symbols, were used
to index Medline records. If the official/preferred gene symbols
did not meet the standards to be an index, then qualified gene
official/preferred names were used. A local copy of Medline
records (up to July, 2002) was pre-selected.

A JAVA module examined the MeSH terms and then indexed
each Medline record with the appropriate disease terms. A
separate JAVA module was used to examine the titles and
abstracts for gene search terms and then to index the gene-related
Medline records with the relevant primary gene key(s).

Statistical Methods. For every gene and disease pair, we
counted records that were indexed for both gene and disease
(double positive hits), for disease only (disease single hits), for
gene only (gene single hits), and for neither gene nor disease
(double negative hits) to generate a 2 x 2 contingency table.
On the basis of the contingency table framework, we applied
different statistical methods to estimate the strength of gene-disease
relationships and evaluated the results. These methods
included chi-square analysis, Fisher's exact probabilities, relative
risk of gene, and relative risk of disease
(http://hipseq.med.harvard.edu/MedGene/). In addition, we computed
the "product of frequency", which is the product of the
proportion of disease/gene double hits to disease single hits
and the proportion of disease/gene double hits to gene single
hits. To obtain a normal distribution, we transformed all the
statistical scores using the natural logarithm. We selected the
log of the product of frequency (LPF) to validate MedGene and
to use for the analysis with the micro-array data. Spearman
rank-correlation coefficients were used to assess the linear
relationship between LPF and micro-array fold change in
expression level.
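
The "product of frequency" score described above can be sketched in a few lines. This is only an illustration, not the authors' JAVA implementation: the function name is hypothetical, and the denominators follow the literal wording of the text (records indexed for the disease only and for the gene only), which is an assumption about how the counts were defined.

```python
import math


def log_product_of_frequency(double_hits, disease_single_hits, gene_single_hits):
    """Natural log of the 'product of frequency' for one gene-disease pair.

    ASSUMPTION: the denominators are the 'disease single hits' and
    'gene single hits' cells of the 2 x 2 table, read literally from the text.
    Returns None when the score is undefined (no co-citations, or an empty
    single-hit cell).
    """
    if min(double_hits, disease_single_hits, gene_single_hits) < 0:
        raise ValueError("counts must be non-negative")
    if double_hits == 0 or disease_single_hits == 0 or gene_single_hits == 0:
        return None
    product = (double_hits / disease_single_hits) * (double_hits / gene_single_hits)
    return math.log(product)


# Hypothetical counts: 10 co-citations, 100 disease-only and 50 gene-only records.
score = log_product_of_frequency(10, 100, 50)
```

Pairs can then be ranked by this score within a disease; a larger (less negative) value indicates that a larger share of the gene's and the disease's records are co-citations.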

Global Analysis. Diseases with at least 50 related genes were
selected for clustering analysis, and the LPF scores were
normalized with total score for each disease. Hierarchical
clustering was done with the Cluster software and the
clustering result was visualized using TreeViewer
(http://rana.lbl.gov/EisenSoftware.htm).

Breast Tissue Micro-Arrays. Eighty-nine breast cancer
samples (79% ER-positive) and 7 normal breast tissue samples
were selected from the Harvard Breast SPORE frozen tissue
repository and were representative of the spectrum of histological
types, grades, and hormone receptor immuno-phenotypes
of breast cancer. Biotinylated cRNA, generated from the
total RNA extracted from the bulk tumor, was hybridized to
Affymetrix U95A oligonucleotide micro-arrays. These micro-arrays
consist of 12 400 probes, which represent approximately
9000 genes. Raw expression values were obtained using GENECHIP
software from Affymetrix, and then further analyzed using
the DNA-Chip Analyzer (dChip) custom software.

Results 

Automated Indexing of Medline Records by Disease and
Gene. To study the gene-disease associations in the literature,
we first compiled complete lists for human diseases and human
genes. To index all Medline records that were relevant to
human diseases, the Medical Subject Heading (MeSH) index
of Medline records was utilized. MeSH is a controlled medical
vocabulary from the National Library of Medicine and consists
of a set of terms or subject headings that are arranged in both
an alphabetic and a hierarchical structure. Medline records
are reviewed manually and MeSH terms are added to each with
software assistance. Twenty-three human disease category
headings along with all of their child terms (see the Supporting
Information, Supplemental Table 1, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_Table 1.html) were
selected from the 2002 MeSH index, creating a list of 4033
human diseases.

No index comparable to the MeSH index exists for genes
and thus, it was necessary to apply a string search algorithm
for gene names or symbols found in Medline text. A complete
list of genes, gene names, gene symbols, and frequently used
synonyms was collected from the LocusLink database at
NCBI, which contains 53 259 independent records keyed
by an official gene symbol or name (June 18th, 2002). For the
purposes of this study, no distinction was made between genes
and their gene products. Authors often use the same name for
both, differentiating the two only by the use of italics, if at all.
For the intended use of this study, this lack of distinction is
unlikely to have a large effect and may in fact be beneficial.

Initial attempts to search the literature using these lists
revealed several sources of false positives and false negatives
(Table 1). False positives primarily arose when the searched
term had other meanings, whereas false negatives arose from
syntax discrepancies, necessitating the development of filters
to reduce these errors. The syntax issues were readily handled
by including alternate syntax forms in the search terms. The
false positive cases, caused by duplicative and unrelated
meanings for the terms, were more difficult to manage. Where
possible, case-sensitive string mapping reduced inappropriate
citations. In many cases, however, this was not sufficient and
the terms had to be eliminated entirely, thereby reducing the
false positive rate but unavoidably under-representing some
genes.

For the purposes of data tracking, a primary gene key was
selected to represent all synonyms that correspond to each
gene. Medline records were indexed with a primary gene key
when any synonym for that key was found in the title or
abstract. Case-insensitive string mapping was used for all
searches except as noted above. No additional weight was
added for multiple occurrences of a term or the co-occurrence
of multiple synonyms for the same gene key.

Table 1. Systematic Sources of False Positives and False Negatives in Unfiltered Data*

error type                       example                                  filter solution

gene symbol/name is not          MAG: myelin associated glycoprotein;     eliminate this term
unique (false positive)          MAG: malignancy-associated protein

gene symbol is unrelated         PA: pallid homologue (mouse)             eliminate this term
abbreviation (false positive)    (also abbrev. for Pennsylvania)

gene symbol/name has language    WAS: Wiskott-Aldrich Syndrome            case-sensitive string search
meaning (false positive)         (also the word "was")

nonstandard syntax               BAG-1 instead of BAG1                    add dash term
(false negative)

unofficial gene name/symbol      P53 instead of TP53                      add all gene nicknames
(false negative)

nonspecified gene name           estrogen receptor instead of             add family stem term
(false negative)                 Estrogen receptor 1

* False positives are relationships that are not real, and false negatives are relationships that are underrepresented.
Medline records were searched with all qualified gene
identifiers, such as the official/preferred gene symbol, the
official/preferred gene name, all gene nicknames and all syntax
variants. In situations where there are several members of a
gene family or splice variants, some authors prefer to use a
shortened gene family name, e.g., estrogen receptor instead of
estrogen receptor 1 (ESR1), creating a source of false negatives.
For this reason, gene family stem terms were created for all
genes that have an alpha or numerical suffix (e.g., IL2RA, TGFB,
ESR1, etc.) and then used to search the literature. The family
stem terms were handled separately from the specific gene
names so that it would be clear when linkages were made to
the gene family versus a specific member in that family.
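
The indexing rules described above (any synonym maps a record to its primary gene key, repeated mentions carry no extra weight, and ambiguous symbols are matched case-sensitively) can be sketched as follows. This is a minimal illustration rather than the authors' JAVA module: the function name, the term dictionary, and the use of alphanumeric boundaries around each term are all assumptions made for the example.

```python
import re


def index_record(text, gene_terms, case_sensitive_terms=frozenset()):
    """Return the set of primary gene keys whose search terms occur in text.

    gene_terms maps a primary gene key to its synonyms and syntax variants.
    Terms listed in case_sensitive_terms must match exactly; all others are
    matched case-insensitively. A set is returned, so multiple occurrences
    or multiple synonyms of one gene add no extra weight.
    """
    keys = set()
    for key, terms in gene_terms.items():
        for term in terms:
            flags = 0 if term in case_sensitive_terms else re.IGNORECASE
            # ASSUMPTION: a term only counts when not embedded in a longer token.
            pattern = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
            if re.search(pattern, text, flags):
                keys.add(key)
                break
    return keys


# Hypothetical term lists built from the examples in Table 1.
GENE_TERMS = {
    "WAS": ["WAS", "Wiskott-Aldrich syndrome"],
    "TP53": ["TP53", "P53"],
    "BAG1": ["BAG1", "BAG-1"],
}
CASE_SENSITIVE = frozenset({"WAS"})

hits = index_record("It was shown that p53 and BAG-1 are altered.",
                    GENE_TERMS, CASE_SENSITIVE)
```

Here the lowercase word "was" is not indexed to the WAS gene, while the nickname "p53" and the dash variant "BAG-1" are both mapped to their primary keys.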
To improve performance and accuracy, some pre-selection
was applied to the records that were scanned. First, review
articles were eliminated to avoid redundant treatment of
citations. Second, non-English journals were removed because
the natural language filters were only relevant to English
publications. Finally, journals unlikely to contain primary data
about gene-disease relationships were also removed (e.g., Int.
J. Health Educ., Bedside Nurse, and J. Health Econ.). Together,
these filters reduced the 12 198 221 Medline publications (July
2002) by 37%.
Ranking the Relative Strengths of Gene-Disease Associations.
In total, there were 618 708 gene-disease co-citations
in which 16% (8297) of all studied genes had been associated
to a disease and 96% (3875) of all diseases had been associated
to at least one gene. To rank the relative strengths of gene-disease
relationships, we tested several different statistical
methods and examined the results. With the exception of the
relative risk estimates, the methods provided similar results
with respect to the rank order of the gene-disease association
strengths. However, after comparing the results to other
databases and after consulting disease experts, the log of the
product of frequency (LPF) was selected for further analysis
because it gave the best results overall.

Validation of MedGene. In developing this tool, it was
important to minimize the number of missed genes (false
negatives) and miscalled genes (false positives). However, in
situations when these goals were in conflict, inclusiveness was
prioritized. To determine the false negative rate in MedGene,
breast cancer was used as a test case because it was associated
with more genes than any other human disease and because




Figure 1. Estimation of the false negative rate by comparison
with hand-curated databases. The breast cancer-related genes
identified by MedGene were compared with those listed in
several other databases including the Tumor Gene Database
(TGD), the Breast Cancer Gene Database (BCG), GeneCards
(GC), and Swissprot. Genes were considered false negatives
if they were represented in at least one of these other databases
and not in MedGene and their link to breast cancer was supported
by at least one literature reference. All literature references
were verified by manual review to confirm their validity. The
number of genes in each database or shared by more than one
database is indicated. The false negative rate was calculated by
genes missed at MedGene (26)/total number of nonoverlapping
genes in other databases (285).

there were several public databases that link genes to breast
cancer. We compared the list of breast cancer-related genes
from MedGene to these databases, as illustrated in Figure 1.
Among the 285 distinct breast cancer-related genes that were
supported by at least one literature citation in these hand-curated
databases, 26 were absent from MedGene, suggesting
a false negative rate of approximately 9%. To determine why
these were missed, all literature references for these genes (80
papers) were reviewed manually (see the Supporting Information,
Supplemental Table 2, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_Table 2.html). Among
these papers, most false negatives were caused by nonstandard
gene terms or gene terms eliminated by our specificity filters.
Few genes were missed because they were only mentioned in
review papers (0.4%) or they appeared only in the body of the
manuscript but not the abstract or title (1.1%). Of note,
MedGene identified approximately 2000 additional breast
cancer-related genes not listed in any other database.

To assess the false positive error rate, two complementary
approaches were used: a detailed analysis of one disease and
a global examination of 1000 diseases. The detailed approach
examined the false positive error rate and its sources, whereas
the global approach tested whether the overall results made
biomedical sense.

Using the LPF, 1467 genes related to prostate cancer were
assembled in rank order. We then retrieved approximately 300
Medline records each for the highest ranked 100 and the lowest
ranked 200 genes and manually reviewed the titles and
abstracts to determine the verity of the association. Nearly 80%
of the highest ranked 100 genes fell into one of the five
categories that reflect meaningful gene-disease relationships
(see the Supporting Information, Supplemental Table 3, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_Table 3.html).
Among the lowest ranked 200 genes, approximately
70% reflected true relationships. Of the 600 records
reviewed, there were only two in which the association between
the gene and the disease was described as negative. Both were
genes with very low scores. In both cases, the authors did not
argue the absence of any relationship, but rather that a
particular feature of the gene or protein was not shown to be
related to human prostate cancer.

The coincidence of some gene symbols with medical abbreviations,
chemical abbreviations and biological abbreviations
resulted in most of the false positives (see the Supporting
Information, Supplemental Table 4, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_Table 4.html),
emphasizing the importance of the filters that were added in the
search algorithm (Table 1). Without the filters, the false positive
rate more than doubled, and the false negative rate rose
dramatically (data not shown). For example, among the papers
about breast cancer, there were only 12 Medline records that
referred to ESR1 and 10 to ESR2, whereas almost 2000 papers
mentioned estrogen receptor without specifying ESR1 or ESR2;
this latter group was detected by the family stem term filter.

To further validate these results, a global analysis of the gene-disease
relationships described by MedGene was performed.
For this experiment, it was reasoned that the more closely
related the diseases are to one another, the more they will be
related to the same gene sets. Thus, if the relationships defined
by MedGene accurately reflected the literature, then an unsupervised
hierarchical clustering of the gene data should group
diseases in a manner consistent with common medical thinking.
Conversely, if the clustered diseases do not make sense
biologically or medically, it may reflect excessive false positives,
false negatives, or inappropriate scoring of the data.

To execute this experiment, the gene sets and the corresponding
LPF values for 1000 randomly selected diseases (each
with at least 50 gene relationships) were used as a dataset for
clustering the diseases. A review of the results showed that the
resulting disease clusters were indeed logical based upon
common medical knowledge (see the Supporting Information,
Supplemental Figure 1, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_Figure 1.html).
For example, in one such cluster shown in Figure 2, diabetes
and its complications grouped together and were also closely
linked to diseases associated with starvation states.

The number of genes associated with a given disease can
be estimated by adjusting the MedGene number up by the false
negative rate (~9%) and down by the false positive rate (~26%
on average). Using this, the average disease has 103.7 ± 45.3
(mean ± s.d.) genes associated with it, although the range is
quite broad with 2359 genes related to breast cancer, 2122
genes related to lung cancer and no genes related to a number
of diseases.
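
The error-rate arithmetic above can be made concrete with a short sketch. The false negative rate uses the counts reported for breast cancer (26 missed genes out of 285 in the hand-curated databases). The adjustment function is an assumption: the text does not give the exact formula, so the form below (remove the expected false positives, then scale up for the missed fraction) is only one plausible reading, and both function names are hypothetical.

```python
def false_negative_rate(missed, total_reference):
    """Fraction of reference genes that the tool failed to report."""
    if total_reference <= 0:
        raise ValueError("total_reference must be positive")
    return missed / total_reference


def adjust_gene_count(observed, false_negative, false_positive):
    """Estimate the number of truly associated genes from an observed count.

    ASSUMPTION: the observed count is first reduced by the false positive
    rate and then divided by (1 - false negative rate). The source text only
    says the number is adjusted 'up' and 'down' by these rates.
    """
    if not 0.0 <= false_negative < 1.0:
        raise ValueError("false_negative must be in [0, 1)")
    if not 0.0 <= false_positive <= 1.0:
        raise ValueError("false_positive must be in [0, 1]")
    true_hits = observed * (1.0 - false_positive)
    return true_hits / (1.0 - false_negative)


fn_rate = false_negative_rate(26, 285)  # about 0.091, i.e. roughly 9%
```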

Applying MedGene to the Analysis of Large Datasets. Access
to a comprehensive summary of the genes linked to human
diseases provided an opportunity to analyze data obtained from
a high-throughput experiment. We compared the MedGene
breast cancer gene list to a gene expression data set generated
from a micro-array analysis comparing breast cancer and
normal breast tissue samples. Micro-array analysis identified
2286 genes that had greater than a 1-fold difference in mean
expression level between breast cancer samples and normal
breast samples. Using MedGene, we sorted the 2286 genes into
four classes: 555 genes directly linked to breast cancer in the
literature by gene term search (first-degree association by gene
name); 328 genes directly linked by family term search (first-degree
association by family term); 1021 genes linked to breast
cancer only through other breast cancer genes (second-degree
association); and 505 genes not previously associated with
breast cancer. (See the Supporting Information, Supplemental
Figure 2, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_Figure 2.html.)
Among the 505 previously unrelated
genes, 467 were either newly identified genes or genes
that had not previously been associated with any disease.
Among the remaining 38 genes, 9 had been related to other
cancers, specifically esophageal, colon, uterine, skin, and cervix.

To determine whether the genes highlighted by the micro-array
analysis were more likely to have been previously linked
to breast cancer in the literature, we created a two-dimensional
plot of the fold change of expression level between breast
cancer and normal tissue versus the literature score (LPF)
(Figure 3A). There was a broad spread of expression changes
among the genes directly linked to breast cancer, ranging from
less than 1-fold change (68%) to over 40-fold (0.3%). Notably,
the majority of genes with greater than 10-fold expression
changes were linked to breast cancer by first-degree association.

Among all 754 genes directly linked to breast cancer in the
literature, there was no correlation between LPF and micro-array
fold change (r = 0.018, p-value = 0.62). However, when
we stratified the analysis based on the magnitude of the fold
change, we observed an increasing trend in correlation (Figure
3B), suggesting that genes with a more substantial change in
expression level were more likely to have a stronger association
in the literature. For genes that had 10-fold change or more in
expression level, the correlation increased to 0.41 (p-value =
0.05).
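
The stratified analysis described above can be sketched with a small, dependency-free Spearman implementation. This is an illustration only: the function names are hypothetical, and the choice of cumulative strata (all genes at or above each fold-change threshold) is an assumption, since the text does not say whether bins or cumulative cut-offs were used.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def stratified_spearman(lpf, fold_change, thresholds):
    """Correlation of LPF with fold change among genes at or above each threshold.

    ASSUMPTION: strata are cumulative (fold change >= threshold). Strata with
    fewer than three genes are reported as None.
    """
    result = {}
    for t in thresholds:
        pairs = [(s, f) for s, f in zip(lpf, fold_change) if f >= t]
        if len(pairs) < 3:
            result[t] = None
        else:
            result[t] = spearman([p[0] for p in pairs], [p[1] for p in pairs])
    return result
```

Calling `stratified_spearman(lpf_scores, fold_changes, [1, 2, 5, 10])` on per-gene lists would give one coefficient per threshold, which is the kind of trend the text reports; significance testing is left out of this sketch.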

When we evaluated the micro-array data separately for ER
positive and ER negative tumors, the trend in correlation
between fold change and literature score was highly dependent
on estrogen receptor status. Interestingly, there was a similar
trend in correlation for ER positive tumors, but no trend in
correlation for ER negative tumors.

Figure 2. Global validation by clustering analysis. 2(A). The gene sets and the corresponding LPF values for 1000 diseases, each with
at least 50 gene relationships, were used in an unsupervised clustering of the diseases based on the genes associated with
them. A sample of the data is shown here. 2(B). One of the resulting clusters is shown, in which diabetes
terms (above the line) and starvation states terms (under the line) clustered together.



Finally, to validate our findings, we computed similar correlations
between the breast cancer expression data and
LPF scores generated by MedGene for hypertension, a
disease unrelated to breast cancer. As expected, we did not
observe an increasing trend in correlation for hypertension.

Table 2. Top 25 Genes Related to Selected Human Diseases*

breast neoplasms: estrogen receptor, PGR, ERBB2, BRCA1, BRCA2, ECM, CYP19,
TFF1, PSEN2, TP53, CES3, CEACAM5, ERBB3, cyclin, COX5A, cathepsin, ERBB4,
TRAM, CCND1, EGF, MUC1, insulin-like, BCL2, mucin, FGF3

hypertension: LEP, AGT, INS, kallikrein, ACE, endothelin, S100A6, BDK,
DIANPH, SARI, PJH, CD59, ALB, CYP11B2, MAT2B, angiotensin receptor, AGTR2,
NPPA, LVM, DBH, NPY, POMC, neuropeptide

rheumatoid arthritis: RA, TNFRSF10A, CRP, AS, ESR1, HLA-DRB1, DR1,
interleukin, INF, R6, collagen, IL1A, AGR, TNFRSF12, IL2, emu, H8,
interleukin 1, matrix metalloproteinase, interferon, CD68, RA, IL17, MMP3, SE

bipolar disorder: ERDA1, SNAP29, PFKL, TRH, IMPA2, HTR3A, DRD3, REM, KCNN3,
DRD4, HTR2C, RELN, DBH, MAOA, COMT, HTR2A, SYWI, INPP1, NEDD4L, FRA13C,
transducer of ERBB2, BAIAP3, ATP1B3, PROS

atherosclerosis: apolipoprotein, APOE, LDLR, ELN, ARG1, APOB, APOA1, MSR1,
LPL, POM, plasminogen activator inhibitor, PLC, vascular cell adhesion
molecule, ATOM, VWF, INS, ARG2, ABCA1, OLR1, collagen, MCP, lipoprotein,
APOA2, intercellular adhesion molecule, RAB27A

* ... disorder, and atherosclerosis, respectively, ... website (http://hipseq.med.harvard.edu/...

Discussion 

The Human Genome Project heralded a new era in biological
research where the emphasis on understanding specific pathways
has expanded to global studies of genomic organization
and biological systems. High-throughput technologies can
provide novel insight into comprehensive biological function
but also introduce new challenges. The utility of these
technologies is limited by the ability to generate, analyze, and
interpret large gene lists. MedGene, a relational database
derived by mining the information in Medline, was created to
address this need. MedGene users can query for a rank-ordered
list of human gene-disease relationships (Table 2) for one or
more diseases. Each entry is hyperlinked to the original papers
supporting each association and to other relevant databases.

MedGene is an innovative extension of previous text mining
approaches. Perez-Iratxeta et al. used the GO annotation of genes and
their chromosomal locations to predict genes that may contribute
to inherited disorders.8 MedGene takes a broader view
and includes all diseases and all possible gene-disease relationships.
Furthermore, MedGene utilizes co-citation to indicate a
relationship rather than GO annotation, which is limited to the
subset of genes that have GO annotation. Our approach is
complementary to that taken by Chaussabel and Sher, who
used the frequency of co-cited terms to cluster genes into a
hierarchy of gene-gene relationships.

A unique aspect of this tool is the ability to assess the relative
strengths of gene-disease relationships based on the frequency
of both co-citation and single citation. This presupposes that
most co-citations describe a positive association, often referred
to as publication bias, and is supported by our observations
that negative associations are rare (Supplemental Table 3:
http://hipseq.med.harvard.edu/MedGene/publication/s_Table 3.html).
Of course, relationships established by frequency
of co-citation do not necessarily represent a true biological link;
however, co-citation is strong evidence to support a true relationship.

Another important feature of MedGene is the implementation
of software filters that substantially reduced the error rate.
We estimate that less than 10% of all associations were missed
and at least 70% of even the weakest associations were real.
For this study, all of the filters that we applied were general
ones, e.g., expanding the list of all gene names to address the
different syntax forms used by different journals, eliminating
gene names that correspond to common English words, etc.
The majority of the remaining search term ambiguities were
idiosyncratic and difficult to identify systematically without
causing a significant rise in false negatives. Alternative approaches,
such as the examination of the nearest neighbor
terms, need to be considered to further reduce the false positive
rate.

It is not uncommon to see expression changes in micro-
array experiments as small as 2-fold reported in the literature.
Even when these expression changes are statistically significant,
it is not always clear if they are biologically meaningful. When
comparing expression levels of disease to normal tissue, one
expects an enrichment of known disease-related genes to
appear in the altered expression group. MedGene provided a
unique opportunity to test this notion in the context of existing
knowledge on a novel breast cancer micro-array dataset. For
genes displaying a 5-fold change or less in tumors compared
to normal, there was no evidence of a correlation between
altered gene expression and a known role in the disease. This

Table 3. Genes with Large Expression Changes in ER- but
Not in ER+ Breast Tumors

gene symbol    fold change (ER+)    fold change (ER-)
KRTHB1                1.0                 610.8
BRS3                  1.2                  89.4
DKK1                  1.2                  69.8
ZIC1                  1.9                  59.6
TLR1                  1.0                  38.5
KIAA0680              2.6                  33.2
CDKN3                 1.0                  30.6
EBI2                  4.0                  27.9
GZMB                  3.8                  21.9
STK18                 4.7                  18.6
GPR49                 1.0                  14.6
MYO10                 1.6                  14.4
LAD1                 -1.0                  13.5
POLE2                 4.2                  13.0
HMG4                  4.4                  12.9
BCL2L11              -1.2                  12.3
LRP8                  2.9                  12.2
CCNB2                 1.0                  11.8
CCNE2                 4.0                  11.6
FGB                  -4.3                  11.1
KNSL6                 2.9                  10.9
HIFS                  3.0                  10.2
SERPINH2              4.6                  10.2
YAP1                  1.0                  10.0
LPHB                 -U                   -10.4
TCEA2                -1.1                 -10.8
TFFJ                  1.3                 -11.4
COL17A1              -4.1                 -15.7
POPS                  U                   -16.2
BPAG1                -4.6                 -22.3
PDZK1                -1.1                 -36.8
VEGFC                -2.8                 -51.5
MUC6                 -1.4                 -64.9
SERPINA5             -1.0                 -83.1
MEIS1                -1.6                 -85.9
CA12                  2.4                -150.3

Table 3. MedGene identified a set of relatively understudied, yet highly
expressed genes in ER-negative, but not ER-positive, breast tumors. All of
these genes have either never been co-cited with breast cancer or have a
weak association, except those marked with an *.



reflects the many genes whose role in breast cancer may not
involve large changes in expression in sporadic tumors (e.g.,
BRCA1 and BRCA2) and genes whose modest changes in
expression may be unrelated to the disease. Strikingly, among
genes with a 10-fold change or more in expression level, there
was a strong and significant correlation between expression
level and a published role in the disease, providing the first
global validation of the micro-array approach to identifying
disease-specific genes.

The results derived from MedGene have two implications.
First, a careful hunt for corroborating evidence of a role in
breast cancer should precede any further study of genes with
less than 5-fold expression level changes. Second, any genes
with 10-fold changes or more are likely to be related to breast
cancer and warrant attention. It is likely that this threshold will
change depending on the disease as well as the experiment.
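The two thresholds above can be read as a simple triage rule. The following Python sketch is illustrative only: the function name, the category labels, and the treatment of changes between 5- and 10-fold (which the text does not address) are assumptions, and the sign convention (negative values for down-regulation) follows Table 3.

```python
def triage_by_fold_change(fold_change, low=5.0, high=10.0):
    """Assign a follow-up category from the magnitude of a fold change.

    `low` and `high` default to the 5-fold and 10-fold thresholds named
    in the text; the text notes these may differ by disease and experiment.
    The "intermediate" label is an assumption for the unaddressed range.
    """
    magnitude = abs(fold_change)  # -64.9 and 64.9 are equally large changes
    if magnitude >= high:
        return "likely disease-related"
    if magnitude < low:
        return "needs corroborating evidence"
    return "intermediate"


# ER- fold changes taken from Table 3
print(triage_by_fold_change(30.6))   # CDKN3
print(triage_by_fold_change(-64.9))  # MUC6
```

Passing different `low` and `high` values is one way to reflect the caveat that the threshold will change with the disease and the experiment.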

Interestingly, the observed correlation was only found among
ER-positive tumors, not ER-negative. This may reflect a bias
in the literature to study the more prevalent type of tumor in
the population. Furthermore, this emphasizes that caution
must be taken when interpreting experiments that may contain
subpopulations that behave very differently. The MedGene
approach identified a set of relatively understudied, yet highly
expressed genes in ER-negative tumors that are worthy of
further examination (Table 3).



In conclusion, we have developed an automated method of
summarizing and organizing the vast biomedical literature. To
our knowledge, the resulting database is the most comprehensive
and accurate of its kind. By generating a score that reflects
the strength of the association, it provides an important tool
for the rapid and flexible analysis of large datasets from various
high-throughput screening experiments. Furthermore, it can
be used for selecting subsets of genes for functional studies,
for building disease-specific arrays, for looking at genes common
to multiple diseases and various other high-throughput
applications. In the future, it will be possible to enhance the
utility of the MedGene database by building links between
genes and other MeSH terms as well as other biological
processes and concepts, such as cell division and responses to
small molecules.

ABSTRACT Wnt family members are critical to many
developmental processes, and components of the Wnt signaling
pathway have been linked to tumorigenesis in familial and
sporadic colon carcinomas. Here we report the identification
of two genes, WISP-1 and WISP-2, that are up-regulated in the
mouse mammary epithelial cell line C57MG transformed by
Wnt-1, but not by Wnt-4. Together with a third related gene,
WISP-3, these proteins define a subfamily of the connective
tissue growth factor family. Two distinct systems demonstrated
WISP induction to be associated with the expression of
Wnt-1. These included (i) C57MG cells infected with a Wnt-1
retroviral vector or expressing Wnt-1 under the control of a
tetracycline repressible promoter, and (ii) Wnt-1 transgenic
mice. The WISP-1 gene was localized to human chromosome
8q24.1-8q24.3. WISP-1 genomic DNA was amplified in colon
cancer cell lines and in human colon tumors and its RNA
overexpressed (2- to >30-fold) in 84% of the tumors examined
compared with patient-matched normal mucosa. WISP-3
mapped to chromosome 6q22-6q23 and also was overexpressed
(4- to >40-fold) in 63% of the colon tumors analyzed.
In contrast, WISP-2 mapped to human chromosome 20q12-
20q13 and its DNA was amplified, but RNA expression was
reduced (2- to >30-fold) in 79% of the tumors. These results
suggest that the WISP genes may be downstream of Wnt-1
signaling and that aberrant levels of WISP expression in colon
cancer may play a role in colon tumorigenesis.



Wnt-1 is a member of an expanding family of cysteine-rich,
glycosylated signaling proteins that mediate diverse developmental
processes such as the control of cell proliferation,
adhesion, cell polarity, and the establishment of cell fates (1,
2). Wnt-1 originally was identified as an oncogene activated by
the insertion of mouse mammary tumor virus in virus-induced
mammary adenocarcinomas (3, 4). Although Wnt-1 is not
expressed in the normal mammary gland, expression of Wnt-1
in transgenic mice causes mammary tumors (5).

In mammalian cells, Wnt family members initiate signaling
by binding to the seven-transmembrane spanning Frizzled
receptors and recruiting the cytoplasmic protein Dishevelled
(Dsh) to the cell membrane (1, 2, 6). Dsh then inhibits the
kinase activity of the normally constitutively active glycogen
synthase kinase-3β (GSK-3β), resulting in an increase in
β-catenin levels. Stabilized β-catenin interacts with the transcription
factor TCF/Lef1, forming a complex that appears in
the nucleus and binds TCF/Lef1 target DNA elements to
activate transcription (7, 8). Other experiments suggest that
the adenomatous polyposis coli (APC) tumor suppressor gene
also plays an important role in Wnt signaling by regulating
β-catenin levels (9). APC is phosphorylated by GSK-3β, binds
to β-catenin, and facilitates its degradation. Mutations in
either APC or β-catenin have been associated with colon
carcinomas and melanomas, suggesting these mutations contribute
to the development of these types of cancer, implicating
the Wnt pathway in tumorigenesis (1).

Although much has been learned about the Wnt signaling 
pathway over the past several years, only a few of the tran- 
scriptionally activated downstream components activated by 
Wnt have been characterized. Those that have been described 
cannot account for all of the diverse functions attributed to 
Wnt signaling. Among the candidate Wnt target genes are 
those encoding the nodal-related 3 gene, Xnr3, a member of 
the transforming growth factor (TGF)-β superfamily, and the
homeobox genes, engrailed, goosecoid, twin (Xtwn), and siamois 
(2). A recent report also identifies c-myc as a target gene of the 
Wnt signaling pathway (10). 

To identify additional downstream genes in the Wnt signal- 
ing pathway that are relevant to the transformed cell pheno- 
type, we used a PCR-based cDNA subtraction strategy, sup- 
pression subtractive hybridization (SSH) (11), using RNA 
isolated from C57MG mouse mammary epithelial cells and 
C57MG cells stably transformed by a Wnt-1 retrovirus. Overexpression
of Wnt-1 in this cell line is sufficient to induce a
partially transformed phenotype, characterized by elongated 
and refractile cells that lose contact inhibition and form a 
multilayered array (12, 13). We reasoned that genes differen- 
tially expressed between these two cell lines might contribute 
to the transformed phenotype. 

In this paper, we describe the cloning and characterization
of two genes up-regulated in Wnt-1 transformed cells, WISP-1
and WISP-2, and a third related gene, WISP-3. The WISP genes
are members of the CCN family of growth factors, which
includes connective tissue growth factor (CTGF), Cyr61, and
nov, a family not previously linked to Wnt signaling.

MATERIALS AND METHODS 

SSH. SSH was performed by using the PCR-Select cDNA
Subtraction Kit (CLONTECH). Tester double-stranded
cDNA was synthesized from 2 μg of poly(A)+ RNA isolated
from the C57MG/Wnt-1 cell line and driver cDNA from 2 μg
of poly(A)+ RNA from the parent C57MG cells. The subtracted
cDNA library was subcloned into a pGEM-T vector for
further analysis.

Abbreviations: TGF, transforming growth factor; CTGF, connective
tissue growth factor; SSH, suppression subtractive hybridization;
VWC, von Willebrand factor type C module.
Data deposition: The sequences reported in this paper have been
deposited in the GenBank database (accession nos. AF100777,
AF100778, AF100779, AF100780, and AF100781).
*To whom reprint requests should be addressed. e-mail: diane@gene.com.

cDNA Library Screening. Clones encoding full-length
mouse WISP-1 were isolated by screening a λgt10 mouse
embryo cDNA library (CLONTECH) with a 70-bp probe from
the original partial clone 568 sequence corresponding to amino
acids 128-169. Clones encoding full-length human WISP-1
were isolated by screening λgt10 lung and fetal kidney cDNA
libraries with the same probe at low stringency. Clones encoding
full-length mouse and human WISP-2 were isolated by
screening a C57MG/Wnt-1 or human fetal lung cDNA library
with a probe corresponding to nucleotides 1463-1512. Full-length
cDNAs encoding WISP-3 were cloned from human
bone marrow and fetal kidney libraries.

Expression of Human WISP RNA. PCR amplification of 
first-strand cDNA was performed with human Multiple Tissue 
cDNA panels (CLONTECH) and 300 μM of each dNTP at
94°C for 1 sec, 62°C for 30 sec, 72°C for 1 min, for 22-32 cycles. 
WISP and glyceraldehyde-3-phosphate dehydrogenase primer 
sequences are available on request. 

In Situ Hybridization. 33P-labeled sense and antisense riboprobes
were transcribed from an 897-bp PCR product corresponding
to nucleotides 601-1440 of mouse WISP-1 or a
294-bp PCR product corresponding to nucleotides 82-375 of 
mouse WISP-2. All tissues were processed as described (40). 

Radiation Hybrid Mapping. Genomic DNA from each 
hybrid in the Stanford G3 and Genebridge4 Radiation Hybrid 
Panels (Research Genetics, Huntsville, AL) and human and 
hamster control DNAs were PCR-amplified, and the results 
were submitted to the Stanford or Massachusetts Institute of 
Technology web servers.

Cell Lines, Tumors, and Mucosa Specimens. Tissue specimens
were obtained from the Department of Pathology (University
of Pittsburgh) for patients undergoing colon resection
and from the University of Leeds, United Kingdom. Genomic
DNA was isolated (Qiagen) from the pooled blood of 10
normal human donors, surgical specimens, and the following
ATCC human cell lines: SW480, COLO 320DM, HT-29,
WiDr, and SW403 (colon adenocarcinomas), SW620 (lymph
node metastasis, colon adenocarcinoma), HCT 116 (colon
carcinoma), SK-CO-1 (colon adenocarcinoma, ascites), and
HM7 (a variant of ATCC colon adenocarcinoma cell line LS
174T). DNA concentration was determined by using Hoechst
dye 33258 intercalation fluorimetry. Total RNA was prepared
by homogenization in 7 M GuSCN followed by centrifugation
over CsCl cushions or prepared by using RNAzol.

Gene Amplification and RNA Expression Analysis. Relative
gene amplification and RNA expression of WISPs and c-myc in
the cell lines, colorectal tumors, and normal mucosa were
determined by quantitative PCR. Gene-specific primers and
fluorogenic probes (sequences available on request) were
designed and used to amplify and quantitate the genes. The
relative gene copy number was derived by using the formula
$2^{\Delta Ct}$, where $\Delta Ct$ represents the difference in amplification
cycles required to detect the WISP genes in peripheral blood
lymphocyte DNA compared with colon tumor DNA or colon
tumor RNA compared with normal mucosal RNA. The
δ-method was used for calculation of the SE of the gene copy
number or RNA expression level. The WISP-specific signal was
normalized to that of the glyceraldehyde-3-phosphate dehydrogenase
housekeeping gene. All TaqMan assay reagents
were obtained from Perkin-Elmer Applied Biosystems.
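As a worked illustration of the 2^ΔCt calculation described above, the following Python sketch computes a relative level from threshold-cycle (Ct) values. The function and parameter names are hypothetical, and the way the housekeeping-gene normalization is applied (subtracting the glyceraldehyde-3-phosphate dehydrogenase Ct from the target Ct within each sample before taking the difference) is an assumption; the text does not spell out that step.

```python
def relative_level(ct_target_ref, ct_gapdh_ref, ct_target_test, ct_gapdh_test):
    """Return the level of a target in a test sample relative to a reference.

    Each target Ct is first normalized to the housekeeping-gene Ct from the
    same sample (assumed procedure). Delta Ct is the reference value minus
    the test value, so a target detected in fewer cycles in the test sample
    yields a ratio above 1.
    """
    norm_ref = ct_target_ref - ct_gapdh_ref
    norm_test = ct_target_test - ct_gapdh_test
    delta_ct = norm_ref - norm_test
    return 2.0 ** delta_ct


# Hypothetical Ct values: the target is detected two cycles earlier in the
# test sample, with equal housekeeping Ct values, giving a 4-fold ratio.
print(relative_level(25.0, 20.0, 23.0, 20.0))
```

For gene amplification the reference would be peripheral blood lymphocyte DNA and the test sample tumor DNA. For the RNA comparison, the paper reports tumor expression relative to matched normal mucosa, so the normal mucosa would serve as the reference.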

RESULTS 

Isolation of WISP-1 and WISP-2 by SSH. To identify Wnt-1-inducible
genes, we used the technique of SSH using the
mouse mammary epithelial cell line C57MG and C57MG cells
that stably express Wnt-1 (11). Candidate differentially expressed
cDNAs (1,384 total) were sequenced. Thirty-nine
percent of the sequences matched known genes or homologues,
32% matched expressed sequence tags, and 29% had
no match. To confirm that the transcript was differentially
expressed, semiquantitative reverse transcription-PCR and
Northern analysis were performed by using mRNA from the
C57MG and C57MG/Wnt-1 cells.

Two of the cDNAs, WISP-1 and WISP-2, were differentially
expressed, being induced in the C57MG/Wnt-1 cell line, but
not in the parent C57MG cells or C57MG cells overexpressing
Wnt-4 (Fig. 1 A and B). Wnt-4, unlike Wnt-1, does not induce
the morphological transformation of C57MG cells and has no
effect on β-catenin levels (13, 14). Expression of WISP-1 was
up-regulated approximately 3-fold in the C57MG/Wnt-1 cell
line and WISP-2 by approximately 5-fold by both Northern
analysis and reverse transcription-PCR.

An independent, but similar, system was used to examine 
WISP expression after Wnt-1 induction. C57MG cells expressing
the Wnt-1 gene under the control of a tetracycline-repressible
promoter produce low amounts of Wnt-1 in the
repressed state but show a strong induction of Wnt-1 mRNA 
and protein within 24 hr after tetracycline removal (8). The 
levels of Wnt-1 and WISP RNA isolated from these cells at 
various times after tetracycline removal were assessed by 
quantitative PCR. Strong induction of Wnt-1 mRNA was seen 
as early as 10 hr after tetracycline removal. Induction of WISP 
mRNA (2- to 6-fold) was seen at 48 and 72 hr (data not shown). 
These data support our previous observations that show that 
WISP induction is correlated with Wnt-1 expression. Because 
the induction is slow, occurring after approximately 48 hr, the 
induction of WISPs may be an indirect response to Wnt-1 
signaling. 

cDNA clones of human WISP-1 were isolated and the
sequence compared with mouse WISP-1. The cDNA sequences
of mouse and human WISP-1 were 1,766 and 2,830 bp in length,
respectively, and encode proteins of 367 aa, with predicted
relative molecular masses of ≈40,000 (Mr 40 K). Both have
hydrophobic N-terminal signal sequences, 38 conserved cysteine
residues, and four potential N-linked glycosylation sites
and are 84% identical (Fig. 2A).

Full-length cDNA clones of mouse and human WISP-2 were
1,734 and 1,293 bp in length, respectively, and encode proteins
of 251 and 250 aa, respectively, with predicted relative molecular
masses of ≈27,000 (Mr 27 K) (Fig. 2B). Mouse and human
WISP-2 are 73% identical. Human WISP-2 has no potential
N-linked glycosylation sites, and mouse WISP-2 has one at







Fig. 1. WISP-1 and WISP-2 are induced by Wnt-1, but not Wnt-4,
expression in C57MG cells. Northern analysis of WISP-1 (A) and
WISP-2 (B) expression in C57MG, C57MG/Wnt-1, and C57MG/
Wnt-4 cells. Poly(A)+ RNA (2 μg) was subjected to Northern blot
analysis and hybridized with a 70-bp mouse WISP-1-specific probe
(amino acids 278-300) or a 190-bp WISP-2-specific probe (nucleotides
1438-1627) in the 3' untranslated region. Blots were rehybridized with
human β-actin probe.

Fig. 2. Encoded amino acid sequence alignment of mouse and
human WISP-1 (A) and mouse and human WISP-2 (B). The potential
signal sequence, insulin-like growth factor-binding protein (IGF-BP),
VWC, thrombospondin (TSP), and C-terminal (CT) domains are
underlined.

position 197. WISP-2 has 28 cysteine residues that are con- 
served among the 38 cysteines found in WISP-1. 

Identification of WISP-3. To search for related proteins, we
screened expressed sequence tag (EST) databases with the
WISP-1 protein sequence and identified several ESTs as
potentially related sequences. We identified a homologous
protein that we have called WISP-3. A full-length human
WISP-3 cDNA of 1,371 bp was isolated corresponding to those
ESTs that encode a 354-aa protein with a predicted molecular
mass of 39,293. WISP-3 has two potential N-linked glycosylation
sites and 36 cysteine residues. An alignment of the three
human WISP proteins shows that WISP-1 and WISP-3 are the
most similar (42% identity), whereas WISP-2 has 37% identity
with WISP-1 and 32% identity with WISP-3 (Fig. 3A).
WISPs Are Homologous to the CTGF Family of Proteins.
Human WISP-1, WISP-2, and WISP-3 are novel sequences;
however, mouse WISP-1 is the same as the recently identified
Elm1 gene. Elm1 is expressed in low, but not high, metastatic
mouse melanoma cells, and suppresses the in vivo growth and
metastatic potential of K-1735 mouse melanoma cells (15).
Human and mouse WISP-2 are homologous to the recently
described rat gene, rCop-1 (16). Significant homology (36-
44%) was seen to the CCN family of growth factors. This family
includes three members, CTGF, Cyr61, and the protooncogene
nov. CTGF is a chemotactic and mitogenic factor for
fibroblasts that is implicated in wound healing and fibrotic
disorders and is induced by TGF-β (17). Cyr61 is an extracellular
matrix signaling molecule that promotes cell adhesion,
proliferation, migration, angiogenesis, and tumor growth (18,
19). nov (nephroblastoma overexpressed) is an immediate
early gene associated with quiescence and found altered in
Wilms tumors (20). The proteins of the CCN family share
functional, but not sequence, similarity to Wnt-1. All are
secreted, cysteine-rich heparin binding glycoproteins that associate
with the cell surface and extracellular matrix.

Fig. 3. (A) Encoded amino acid sequence alignment of human
WISPs. The cysteine residues of WISP-1 and WISP-2 that are not
present in WISP-3 are indicated with a dot. (B) Schematic representation
of the WISP proteins showing the domain structure and cysteine
residues (vertical lines). The four cysteine residues in the VWC domain
that are absent in WISP-3 are indicated with a dot. (C) Expression of
WISP mRNA in human tissues. PCR was performed on human
multiple-tissue cDNA panels (CLONTECH) from the indicated adult
and fetal tissues.

WISP proteins exhibit the modular architecture of the CCN
family, characterized by four conserved cysteine-rich domains
(Fig. 3B) (21). The N-terminal domain, which includes the first
12 cysteine residues, contains a consensus sequence (GCGCCXXC)
conserved in most insulin-like growth factor (IGF)-binding
proteins (BP). This sequence is conserved in WISP-2
and WISP-3, whereas WISP-1 has a glutamine in the third
position instead of a glycine. CTGF recently has been shown
to specifically bind IGF (22) and a truncated nov protein
lacking the IGF-BP domain is oncogenic (23). The von Willebrand
factor type C module (VWC), also found in certain
collagens and mucins, covers the next 10 cysteine residues, and
is thought to participate in protein complex formation and
oligomerization (24). The VWC domain of WISP-3 differs
from all CCN family members described previously, in that it
contains only six of the 10 cysteine residues (Fig. 3 A and B).
A short variable region follows the VWC domain. The third
module, the thrombospondin (TSP) domain, is involved in
binding to sulfated glycoconjugates and contains six cysteine
residues and a conserved WSxCSxxCG motif first identified in
thrombospondin (25). The C-terminal (CT) module containing
the remaining 10 cysteines is thought to be involved in
dimerization and receptor binding (26). The CT domain is
present in all CCN family members described to date but is
absent in WISP-2 (Fig. 3 A and B). The existence of a putative
signal sequence and the absence of a transmembrane domain
suggest that WISPs are secreted proteins, an observation
supported by an analysis of their expression and secretion from
mammalian cell and baculovirus cultures (data not shown).
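The two consensus motifs named above (GCGCCXXC and WSxCSxxCG, with X or x standing for any residue) can be located in a protein sequence with a regular-expression scan. The Python sketch below is an illustration, not the method used in the paper; the motif labels, the function name, and the example sequences are hypothetical and are not WISP sequences.

```python
import re

# "X"/"x" in the consensus is written as "." (any single residue).
MOTIFS = {
    "IGF-BP consensus": re.compile(r"GCGCC..C"),
    "TSP consensus": re.compile(r"WS.CS..CG"),
}


def find_motifs(protein_sequence):
    """Return (motif name, 0-based start, matched text) tuples, ordered by start."""
    sequence = protein_sequence.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(sequence):
            hits.append((name, match.start(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])


# Made-up sequence carrying one copy of each motif
print(find_motifs("MAGCGCCPVCAKWSPCSRTCGL"))
```

A sequence with glutamine at the third position, as the text describes for WISP-1 (GCQCC rather than GCGCC), would not match the strict first pattern. Widening it to `GC[GQ]CC..C` is one way to tolerate that substitution.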

Expression of WISP mRNA in Human Tissues. Tissue-specific
expression of human WISPs was characterized by PCR
analysis on adult and fetal multiple tissue cDNA panels.
WISP-1 expression was seen in the adult heart, kidney, lung,
pancreas, placenta, ovary, small intestine, and spleen (Fig. 3C).
Little or no expression was detected in the brain, liver, skeletal
muscle, colon, peripheral blood leukocytes, prostate, testis, or
thymus. WISP-2 had a more restricted tissue expression and
was detected in adult skeletal muscle, colon, ovary, and fetal
lung. Predominant expression of WISP-3 was seen in adult
kidney and testis and fetal kidney. Lower levels of WISP-3
expression were detected in placenta, ovary, prostate, and
small intestine.

In Situ Localization of WISP-1 and WISP-2. Expression of 
WISP-1 and WISP-2 was assessed by in situ hybridization in 
mammary tumors from Wnt-1 transgenic mice. Strong expres- 
sion of WISP-1 was observed in stromal fibroblasts lying within 
the fibrovascular tumor stroma (Fig. 4 A-D). However, low- 
level WISP-1 expression also was observed focally within tumor 
cells (data not shown). No expression was observed in normal 
breast. Like WISP-1, WISP-2 expression also was seen in the 
tumor stroma in breast tumors from Wnt-1 transgenic animals 
(Fig. 4 E-H). However, WISP-2 expression in the stroma was 
in spindle-shaped cells adjacent to capillary vessels, whereas 




Fig. 4. (A, C, E, and G) Representative hematoxylin/eosin-stained
images from breast tumors in Wnt-1 transgenic mice. The correspond- 
ing dark-field images showing WISP-1 expression are shown in B and 
D. The tumor is a moderately well-differentiated adenocarcinoma 
showing evidence of adenoid cystic change. At low power (A and B), 
expression of WISP-1 is seen in the delicate branching fibrovascular 
tumor stroma (arrowhead). At higher magnification, expression is seen 
in the stromal(s) fibroblasts (C and D), and tumor cells are negative. 
Focal expression of WISP-1, however, was observed in tumor cells in 
some areas. Images of WISP-2 expression are shown in E-H. At low 
power (E and F), expression of WISP-2 is seen in cells lying within the 
fibrovascular tumor stroma. At higher magnification, these cells 
appeared to be adjacent to capillary vessels whereas tumor cells are 
negative (G and H). 



the predominant cell type expressing WISP-1 was the stromal 
fibroblasts. 

Chromosome Localization of the WISP Genes. The chro- 
mosomal location of the human WISP genes was determined 
by radiation hybrid mapping panels. WISP-1 is approximately 
3.48 cR from the meiotic marker AFM259xc5 [logarithm of 
odds (lod) score 16.31] on chromosome 8q24.1 to 8q24.3, in the 
same region as the human locus of the novH family member 
(27) and roughly 4 Mbs distal to c-myc (28). Preliminary fine 
mapping indicates that WISP-1 is located near D8S1712 STS. 
WISP-2 is linked to the marker SHGC-33922 (lod = 1,000) on 
chromosome 20q12-20q13.1. Human WISP-3 mapped to chromosome
6q22-6q23 and is linked to the marker AFM211ze5
(lod = 1,000). WISP-3 is approximately 18 Mbs proximal to 
CTGF and 23 Mbs proximal to the human cellular oncogene 
MYB (27, 29). 

Amplification and Aberrant Expression of WISPs in Human
Colon Tumors. Amplification of protooncogenes is seen in
many human tumors and has etiological and prognostic significance.
For example, in a variety of tumor types, c-myc
amplification has been associated with malignant progression
and poor prognosis (30). Because WISP-1 resides in the same
general chromosomal location (8q24) as c-myc, we asked
whether it was a target of gene amplification, and, if so,
whether this amplification was independent of the c-myc locus.
Genomic DNA from human colon cancer cell lines was
assessed by quantitative PCR and Southern blot analysis (Fig.
5 A and B). Both methods detected similar degrees of WISP-1
amplification. Most cell lines showed significant (2- to 4-fold)
amplification, with the HT-29 and WiDr cell lines demonstrating
an 8-fold increase. Significantly, the pattern of amplification
observed did not correlate with that observed for c-myc,
indicating that the c-myc gene is not part of the amplicon that
involves the WISP-1 locus.

We next examined whether the WISP genes were amplified
in a panel of 25 primary human colon adenocarcinomas. The
relative WISP gene copy number in each colon tumor DNA
was compared with pooled normal DNA from 10 donors by
quantitative PCR (Fig. 6). The copy number of WISP-1 and
WISP-2 was significantly greater than one, approximately
2-fold for WISP-1 in about 60% of the tumors and 2- to 4-fold
for WISP-2 in 92% of the tumors (P < 0.001 for each). The
copy number for WISP-3 was indistinguishable from one (P =
0.166). In addition, the copy number of WISP-2 was significantly
higher than that of WISP-1 (P < 0.001).

The levels of WISP transcripts in RNA isolated from 19
adenocarcinomas and their matched normal mucosa were



Fig. 5. Amplification of WISP-1 genomic DNA in colon cancer cell
lines. (A) Amplification in cell line DNA was determined by quantitative
PCR. (B) Southern blots containing genomic DNA (10 μg)
digested with EcoRI (WISP-1) or XbaI (c-myc) were hybridized with
a 100-bp human WISP-1 probe (amino acids 186-219) or a human
c-myc probe (located at bp 1901-2000). The WISP and myc genes are
detected in normal human genomic DNA after a longer film exposure.

Fig. 6. Genomic amplification of WISP genes in human colon
tumors. The relative gene copy number of the WISP genes in 25
adenocarcinomas was assayed by quantitative PCR, by comparing
DNA from primary human tumors with pooled DNA from 10 healthy
donors. The data are means ± SEM from one experiment done in
triplicate. The experiment was repeated at least three times.

assessed by quantitative PCR (Fig. 7). The level of WISP-1 
RNA present in tumor tissue varied but was significantly 
increased (2- to >25-fold) in 84% (16/19) of the human colon 
tumors examined compared with normal adjacent mucosa. 
Four of 19 tumors showed greater than 10-fold overexpression. 
In contrast, in 79% (15/19) of the tumors examined, WISP-2 
RNA expression was significantly lower in the tumor than the 
mucosa. Similar to WISP-1, WISP-3 RNA was overexpressed in
63% (12/19) of the colon tumors compared with the normal

Fig. 7. WISP RNA expression in primary human colon tumors
relative to expression in normal mucosa from the same patient.
Expression of WISP mRNA in 19 adenocarcinomas was assayed by 
quantitative PCR. The Dukes stage of the tumor is listed under the 
sample number. The data are means ± SEM from one experiment 
done in triplicate. The experiment was repeated at least twice. 



mucosa. The amount of overexpression of WISP-3 ranged from
4- to >40-fold.
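The percentages reported for the 19 matched tumor samples follow directly from the stated counts (16/19, 15/19, and 12/19). A short Python check of that arithmetic, with a hypothetical helper name:

```python
def percent_of_tumors(count, total=19):
    """Express a tumor count as a whole-number percentage of the panel."""
    return round(100 * count / total)


# Counts as reported in the text for the 19 matched tumor/mucosa pairs
for gene, count in [("WISP-1", 16), ("WISP-2", 15), ("WISP-3", 12)]:
    print(gene, percent_of_tumors(count))
```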



DISCUSSION 

One approach to understanding the molecular basis of cancer 
is to identify differences in gene expression between cancer
cells and normal cells. Strategies based on assumptions that 
steady-state mRNA levels will differ between normal and 
malignant cells have been used to clone differentially ex- 
pressed genes (31). We have used a PCR-based selection 
strategy, SSH, to identify genes selectively expressed in 
C57MG mouse mammary epithelial cells transformed by 
Wnt-1. 

Three of the genes isolated, WISP-1, WISP-2, and WISP-3,
are members of the CCN family of growth factors, which
includes CTGF, Cyr61, and nov, a family not previously linked
to Wnt signaling.

Two independent experimental systems demonstrated that 
WISP induction was associated with the expression of Wnt-1. 
The first was C57MG cells infected with a Wnt-1 retroviral 
vector or C57MG cells expressing Wnt-1 under the control of 
a tetracycline-repressible promoter, and the second was in
Wnt-1 transgenic mice, where breast tissue expresses Wnt-1,
whereas normal breast tissue does not. No WISP RNA expression
was detected in mammary tumors induced by polyoma
virus middle T antigen (data not shown). These data suggest 
a link between Wnt-1 and WISPs in that in these two situations, 
WISP induction was correlated with Wnt-1 expression. 

It is not clear whether the WISPs are directly or indirectly
induced by the downstream components of the Wnt-1 signaling
pathway (i.e., β-catenin-TCF-1/Lef1). The increased levels of
WISP RNA were measured in Wnt-1-transformed cells, hours
or days after Wnt-1 transformation. Thus, WISP expression
could result from Wnt-1 signaling directly through β-catenin
transcription factor regulation or alternatively through Wnt-1
signaling turning on a transcription factor, which in turn
regulates WISPs.

The WISPs define an additional subfamily of the CCN family 
of growth factors. One striking difference observed in the 
protein sequence of WISP-2 is the absence of a CT domain, 
which is present in CTGF, Cyr61, nov, WISP-1, and WISP-3. 
This domain is thought to be involved in receptor binding and 
dimerization. Growth factors, such as TGF-β, platelet-derived 
growth factor, and nerve growth factor, which contain a cystine 
knot motif exist as dimers (32). It is tempting to speculate that 
WISP-1 and WISP-3 may exist as dimers, whereas WISP-2 
exists as a monomer. If the CT domain is also important for 
receptor binding, WISP-2 may bind its receptor through a 
different region of the molecule than the other CCN family 
members. No specific receptors have been identified for CTGF 
or nov. A recent report has shown that integrin αvβ3 serves as 
an adhesion receptor for Cyr61 (33). 

The strong expression of WISP-1 and WISP-2 in cells lying 
within the fibrovascular tumor stroma in breast tumors from 
Wnt-1 transgenic animals is consistent with previous observations 
that transcripts for the related CTGF gene are primarily 
expressed in the fibrous stroma of mammary tumors 
(34). Epithelial cells are thought to control the proliferation of 
connective tissue stroma in mammary tumors by a cascade of 
growth factor signals similar to that controlling connective 
tissue formation during wound repair. It has been proposed 
that mammary tumor cells or inflammatory cells at the tumor 
interstitial interface secrete TGF-β1, which is the stimulus for 
stromal proliferation (34). TGF-β1 is secreted by a large 
percentage of malignant breast tumors and may be one of the 
growth factors that stimulates the production of CTGF and 
WISPs in the stroma. 

It was of interest that WISP-1 and WISP-2 expression was 
observed in the stromal cells that surrounded the tumor cells 
(epithelial cells) in the Wnt-1 transgenic mouse sections of 
breast tissue. This finding suggests that paracrine signaling 
could occur in which the stromal cells could supply WISP-1 and 
WISP-2 to regulate tumor cell growth on the WISP extracellular 
matrix. Stromal cell-derived factors in the extracellular 
matrix have been postulated to play a role in tumor cell 
migration and proliferation (35). The localization of WISP-1 
and WISP-2 in the stromal cells of breast tumors supports this 
paracrine model. 

An analysis of WISP-1 gene amplification and expression in 
human colon tumors showed a correlation between DNA 
amplification and overexpression, whereas overexpression of 
WISP-3 RNA was seen in the absence of DNA amplification. 
In contrast, WISP-2 DNA was amplified in the colon tumors, 
but its mRNA expression was significantly reduced in the 
majority of tumors compared with the expression in normal 
colonic mucosa from the same patient. The gene for human 
WISP-2 was localized to chromosome 20q12-20q13, at a region 
frequently amplified and associated with poor prognosis in 
node-negative breast cancer and many colon cancers, suggesting 
the existence of one or more oncogenes at this locus 
(36-38). Because the center of the 20q13 amplicon has not yet 
been identified, it is possible that the apparent amplification 
observed for WISP-2 may be caused by another gene in this 
amplicon. 

A recent manuscript on rCop-1, the rat orthologue of 
WISP-2, describes the loss of expression of this gene after cell 
transformation, suggesting it may be a negative regulator of 
growth in cell lines (16). Although the mechanism by which 
WISP-2 RNA expression is down-regulated during malignant 
transformation is unknown, the reduced expression of WISP-2 
in colon tumors and cell lines suggests that it may function as 
a tumor suppressor. These results show that the WISP genes 
are aberrantly expressed in colon cancer and suggest that their 
altered expression may confer selective growth advantage to 
the tumor. 

Members of the Wnt signaling pathway have been impli- 
cated in the pathogenesis of colon cancer, breast cancer, and 
melanoma, including the tumor suppressor gene adenomatous 
polyposis coli and β-catenin (39). Mutations in specific regions 
of either gene can cause the stabilization and accumulation of 
cytoplasmic β-catenin, which presumably contributes to human 
carcinogenesis through the activation of target genes such 
as the WISPs. Although the mechanism by which Wnt-1 
transforms cells and induces tumorigenesis is unknown, the 
identification of WISPs as genes that may be regulated downstream 
of Wnt-1 in C57MG cells suggests they could be 
important mediators of Wnt-1 transformation. The amplifica- 
tion and altered expression patterns of the WISPs in human 
colon tumors may indicate an important role for these genes 
in tumor development. 

ABSTRACT. The consistent cytogenetic translocation of 
chronic myelogenous leukemia (the Philadelphia chromosome, 
Ph1) has been observed in cells of multiple hematopoietic 
lineages. This translocation creates a chimeric gene composed 
of breakpoint-cluster-region (bcr) sequences from chromosome 
22 fused to a portion of the abl oncogene on chromosome 9. The 
resulting gene product (P210c-abl) resembles the transforming 
protein of the Abelson murine leukemia virus in its structure 
and tyrosine kinase activity. P210c-abl is expressed in Ph1-positive 
cell lines of myeloid lineage and in clinical specimens 
with myeloid predominance. We show here that Epstein-Barr 
virus-transformed B-lymphocyte lines that retain Ph1 can 
express P210c-abl. The level of expression in these B-cell lines is 
generally lower and more variable than that observed for 
myeloid lines. Protein expression is not related to amplification 
of the abl gene but to variation in the level of bcr-abl mRNA 
produced from a single Ph1 template. 

Chronic myelogenous leukemia (CML) is a disease of the 
pluripotent stem cell (1). In greater than 95% of patients, the 
leukemic cells contain the cytogenetic marker known as the 
Philadelphia chromosome, or Ph1 (2). This reciprocal 
translocation event between the long arms of chromosomes 
9 and 22 has been used as a disease-specific marker for 
diagnosis and evaluation of therapy. Multiple hematopoietic 
lineages, including myeloid and B-lymphoid, contain Ph1 in 
early or chronic phase, as well as in the more acute accelerated 
and blast crisis phases of the disease. 

One molecular consequence of Ph1 is the translocation of 
the chromosomal arm containing the c-abl gene on chromosome 
9 into the middle of the breakpoint-cluster region (bcr) 
gene on chromosome 22 (3-6). Although the precise 
translocation breakpoints are variable, an RNA-splicing 
mechanism generates a very similar 8-kilobase (kb) mRNA in 
each case (5-9). The hybrid bcr-abl message encodes a 
structurally altered form of the abl oncogene product, called 
P210c-abl (10-13), with an amino-terminal segment derived 
from a portion of the exons of bcr on chromosome 22 and a 
carboxyl-terminal segment derived from a major portion of 
the exons of the c-abl gene on chromosome 9. The chimeric 
structure of bcr-abl and the resulting P210c-abl is similar to the 
structure of the Abelson murine leukemia virus gag-abl 
genome and resulting P160v-abl transforming gene product. 
Both proteins have very similar tyrosine kinase activities (10, 
11, 14) which can be distinguished by their relative stability 
to denaturing detergents and by their ATP requirements from 
the recently described tyrosine kinase activity of the c-abl 
gene product (15). 

In concert with structural modification of the amino-terminal 
portion of the abl gene, increased level of expression 
has been implicated in activation of c-abl oncogenic potential. 
Myeloid and erythroid cell lines and clinical samples 
derived from acute-phase CML patients contain about 10-fold 
higher levels of the 8-kb bcr-abl mRNA and P210c-abl than 
the c-abl mRNA forms (6 and 7 kb) and P145c-abl gene product 
(5, 8, 9, 11). The higher level of expression of the chimeric 
bcr-abl message in acute-phase cells is not likely to be solely 
due to the presence of the bcr promoter sequences at the 5' 
end of the gene, since the normal 4.5-kb and 6.7-kb bcr-encoded 
mRNA species are expressed at an even lower level 
than the normal c-abl messages (5, 6). 

We have analyzed a series of Epstein-Barr virus-immortalized 
B-lymphoid cell lines derived from CML patients (16). 
With such in vitro clonal cell lines, we can evaluate whether 
the presence of Ph1 always results in synthesis of the chimeric 
bcr-abl message and protein, and whether the quantitative 
expression varies for cells of B-lymphoid lineage as compared 
to previously examined myeloid cell lines. Our results 
show that cell lines that retain Ph1 do express bcr-abl message 
and protein, but that the level is generally lower and more 
variable than previously seen for myeloid cell lines. The 
demonstration that the Ph1 chromosomal template can vary 
in its level of expression of P210c-abl suggests that secondary 
mechanisms, beyond the translocation itself, contribute to 
the regulation of the bcr-abl gene in different cell types or 
subclones that derive from the affected stem cell. 

MATERIALS AND METHODS 

Cells and Cell Labeling. Epstein-Barr virus-transformed 
B-lymphoid cell lines were established from peripheral blood 
samples of chronic- and acute-phase CML patients as reported 
(16). The cell lines are designated according to patient 
number, karyotype, and lineage. For example, 
SK-CML7Bt(9;22)-33 refers to CML patient 7, B-lymphoid cell 
line, 9;22 translocation (Ph1), cell line 33; and SK-CML7BN-2 
refers to B-cell line 2 with a normal karyotype derived from 
the same patient. Repeat karyotype analysis was performed 
to verify the retention of Ph1 just prior to analysis for abl 
protein and RNA. Cells were maintained in RPMI 1640 
medium with 20% fetal bovine serum. We have not observed 
any consistent pattern of in vitro growth rate that correlates 
to the stage of disease at the time of transformation with 
Epstein-Barr virus. Cells (1.5 x 10^7) were washed twice with 
Dulbecco's modified Eagle's medium lacking phosphate and 
supplemented with 5% dialyzed fetal bovine serum. Cells 
were then resuspended in 2 ml of the minimal medium. 
Labeling was started with the addition of [32P]orthophosphate 
(1 mCi/ml; ICN; 1 Ci = 37 GBq) and continued at 37°C 
for 3-4 hr. 

Abbreviations: bcr, breakpoint-cluster region; CML, chronic 
myelogenous leukemia; kb, kilobase(s). 

Present address: Department of Genetics, University of Washing- 
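
The cell-line naming convention described above (patient number, karyotype, and cell-line number) can be illustrated with a short parsing sketch. This is not part of the original methods: the regular expression, the `CellLine` type, and the `parse_cell_line` function are hypothetical, and the sketch assumes that every name follows the pattern of the two examples given in the text.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for names such as SK-CML7Bt(9;22)-33 or SK-CML7BN-2:
# patient number, "B" for B-lymphoid, "t" (translocation) or "N" (normal
# karyotype), an optional translocation in parentheses, and a line number.
_NAME = re.compile(r"^SK-CML(\d+)B(t|N)(?:\((\d+;\d+)\))?-(\d+)$")


class CellLine(NamedTuple):
    patient: int                  # CML patient number
    ph_positive: bool             # True if the name carries the "t" marker
    translocation: Optional[str]  # e.g. "9;22" when spelled out in the name
    line_number: int              # number of the individual cell line


def parse_cell_line(name: str) -> CellLine:
    """Split a cell-line name into its components (illustrative only)."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized cell-line name: {name!r}")
    patient, karyotype, translocation, line_number = match.groups()
    return CellLine(int(patient), karyotype == "t", translocation, int(line_number))


if __name__ == "__main__":
    print(parse_cell_line("SK-CML7Bt(9;22)-33"))
    print(parse_cell_line("SK-CML7BN-2"))
```

Names that do not fit the assumed pattern raise a `ValueError` rather than being guessed at, which seems the safer choice for identifiers taken from a table.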

Immunoprecipitation and Immunoblotting. Immunoprecipitations 
were carried out as described (10). Cells (1.5 x 10^7) 
were washed with phosphate-buffered saline and extracted 
with 3-5 ml of phosphate lysis buffer (1% Triton X-100/0.1% 
NaDodSO4/0.5% deoxycholate/10 mM Na2HPO4, pH 7.5/ 
100 mM NaCl) with 5 mM EDTA and 5 mM phenylmethylsulfonyl 
fluoride. Extracts were clarified by centrifugation 
and precipitated with normal or rabbit anti-abl sera (anti-pEX-2 
or anti-pEX-5) (17). The precipitated proteins were 
electrophoresed in a NaDodSO4/8% polyacrylamide gel. 
32P-labeled proteins were detected by autoradiography. 
Alternatively, abl proteins were detected by immunoblotting. 
Extracts from unlabeled cells were clarified, and proteins 
were concentrated by immunoprecipitation with rabbit antisera 
against abl-encoded proteins [anti-pEX-2 and anti-pEX-5 
combined (17)] and then fractionated in 8% acrylamide gels. 
The proteins were transferred from the gel to nitrocellulose 
filters, using protease-facilitated transfer (18). The abl-encoded 
proteins were detected using murine monoclonal 
antibodies as a probe and peroxidase-conjugated goat anti-mouse 
second-stage antibody (Bio-Rad) for development. 
Rabbit antisera and mouse monoclonal antibodies to abl 
proteins were prepared using bacterially expressed regions of 
the v-abl protein as immunogens (17, 19). Anti-pEX-2 antibodies 
react with the internal tyrosine kinase domain and 
anti-pEX-5 antibodies react with the carboxyl-terminal segment 
of the abl proteins. 

RNA Analysis. RNA was extracted from 10^8 cells by the 
NaDodSO4/urea/phenol method (20). Polyadenylylated 
RNA was purified by oligo(dT) affinity chromatography. 
Samples were electrophoresed in a 1% agarose/formaldehyde 
gel and transferred to nitrocellulose. abl RNA species 
were detected by hybridization with a nick-translated v-abl 
fragment probe (21). 

DNA Analysis. DNA was prepared from 5 x 10^7 cells of 
each cell line and processed for Southern blots with a v-abl 
probe as described (21). 

RESULTS 

Variable Levels of P210c-abl Are Detected in Ph1-Positive Cell 
Lines. Ph1-positive and Ph1-negative, Epstein-Barr virus-transformed 
B-lymphocyte cell lines derived from the same 
patient were examined for P210c-abl synthesis by immunoprecipitation 
of [32P]orthophosphate-labeled cell extracts 
with anti-abl sera (Fig. 1). The normal c-abl protein P145c-abl 
was detected at a similar level in multiple Ph1-positive and 
Ph1-negative cell lines. P210c-abl was only detected in the 
Ph1-positive cell lines because the bcr-abl chimeric gene 
which encodes P210c-abl resides on the Ph1 (4, 5, 11, 13). The 
level of P210c-abl was about 4- to 5-fold higher than the level 
of P145c-abl in the SK-CML7Bt-33 cell line (Fig. 1A, +). The 
Ph1-positive erythroid-progenitor cell line K562 (C) showed 
a level of P210c-abl about 10-fold higher than P145c-abl. 
However, the level of P210c-abl was about one-fifth that of 
P145c-abl in the Ph1-positive SK-CML16Bt-1 cell line (Fig. 1B, 
+). Comparison of different autoradiographic exposures 
roughly indicated that the level of P210c-abl varies over a 
20-fold range between these Ph1-positive B-cell lines. Analysis 
of four additional Ph1-positive B-cell lines demonstrated 
that the level of P210c-abl fell into two general classes; some 
cell lines had a level of P210c-abl similar to SK-CML7Bt-33 
and others had the low level similar to SK-CML16Bt-1 (Table 
1). This differs from previous studies with Ph1-positive 
myeloid cell lines and patient samples derived from acute-phase 
CML patients, in which P210c-abl was detected at a 
10-fold higher level than P145c-abl (refs. 10 and 11; Table 1). 
There was no large difference in level of chimeric mRNA and 
P210c-abl expressed in four myeloid/erythroid-lineage Ph1-positive 
cell lines (K562, EM2, EM3, KCL22, and BV173; 
refs. 9 and 11), despite a 4- to 5-fold amplification of 
abl-related sequences in the K562 cell line. 

Fig. 1. Detection of variable levels of P210c-abl in Ph1-positive 
B-cell lines. Production of P145c-abl and P210c-abl in Epstein-Barr 
virus-transformed B-cell lines derived from a blast-crisis (A) and a 
chronic-phase (B) CML patient was examined by metabolic labeling 
with [32P]orthophosphate and immunoprecipitation. Ph1-negative 
(-) and Ph1-positive (+) cell lines derived from each patient were 
analyzed. The Ph1-negative cell line in A,- is SK-CML7BN-2 and in 
B,- is SK-CML16BN-1. The Ph1-positive cell line in A,+ is 
SK-CML7Bt-33 and in B,+ is SK-CML16Bt-1. The K562 cell line, a 
Ph1-positive erythroid progenitor cell line spontaneously derived 
from a blast-crisis patient (33), is represented in C. Cells (1.5 x 10^7) 
were metabolically labeled with 2 mCi of [32P]orthophosphate for 3-4 
hr and then were extracted and clarified by centrifugation. Samples 
were immunoprecipitated with control normal serum (lanes 1), 
anti-pEX-2 (lanes 2), or anti-pEX-5 (lanes 3) and analyzed by 
NaDodSO4/8% PAGE followed by autoradiography with an intensifying 
screen (3 days for A and C, 10 days for B). 

Detection of different levels of P210c-abl in Fig. 1 could be 
due to decreased phosphorylation of P210c-abl, a lower level 
of P210c-abl synthesis, or altered stability of the protein. To 
help distinguish among these possibilities, the steady-state 
level of P210c-abl in the cell lines was assayed by immunoblotting. 
The results show that SK-CML7Bt-33 (Fig. 2A, +) 
had a higher level of P210c-abl than P145, similar to the results 
with metabolic labeling (Fig. 1). We did not detect P210c-abl 
by immunoblotting with 2 x 10^7 cells of line SK-CML8Bt-3 
(Fig. 2B, +). Reconstruction experiments using dilutions of 
cell extracts showed that we could detect about 5-10% the 
level of P210c-abl expressed in the K562 cell line (data not 
shown). We infer that the steady-state level of P210c-abl in 
SK-CML8Bt-3 is lower than the level in SK-CML7Bt-33 by 
a factor of at least 10. The level of P210c-abl detected in these 
assays correlated with the amount of P210c-abl tyrosine kinase 
activity that could be detected in vitro (data not shown). 

Different Levels of P210c-abl Are Reflected in the Amount of 
Stable bcr-abl mRNA. To identify the basis for detection of 
variable levels of P210c-abl, we examined the production of 
the abl RNA. RNA blot hybridization analysis using a v-abl 
probe (Fig. 3) showed that the normal 6- and 7-kb c-abl 
mRNAs were present at a similar level in Ph1-positive and 
-negative cell lines derived from different patients. However, 
the 8-kb mRNA that encodes P210c-abl was detected at a 
10-fold higher level in SK-CML7Bt-33 (Fig. 3A, +) than in 
SK-CML16Bt-1 (B, +), which correlated with the relative 
level of P210c-abl detected in each cell line. Analysis of 
additional cell lines demonstrated that the level of 8-kb RNA 
directly correlated with the level of P210c-abl (Table 1). The 
variation in level of 8-kb RNA detected in these cell lines was 
not due to loss or gain of Ph1, because cytogenetic analysis 
confirmed the presence of Ph1 in these cell lines (ref. 16 and 

Proc. Natl. Acad. Sci. USA 83 (1986) 4051 

Table 1. Relative levels of bcr-abl expression in Epstein-Barr 

Cell line*       CML phase†   Ph1‡   P210§    8-kb mRNA¶ 
SK-CML7BN-2      BC           -      -        - 
SK-CML8BN-10     Chronic             -        - 
SK-CMUBN-12      Chronic      -      -        - 
SK-CML16BN-1     Chronic 
SK-CML35BN-1     Chronic 
SK-CML7Bt-33     BC           +      +++      +++ 
SK-CML21Bt-1     Acc                 +++      +++ 
SK-CMLHBt-6      Acc          +      +++      +++ 
SK-CML8Bt-3      Chronic      +      +        ± 
SK-CML16Bt-1     Chronic      +               + 
SK-CML35Bt-2     Chronic      +      +        + 
K562             BC           +      +++++    +++++ 
BV173            BC           +      +++++    +++++ 
EM2              BC           +      +++++    +++++ 


*Cell lines derived from CML patients by transformation with 
Epstein-Barr virus as described (16). Names of cell lines indicate 
patient number and Ph1 status: SK-CML7Bt indicates a cell line 
derived from patient 7 that carries the 9;22 Ph1 translocation; N 
indicates a normal karyotype. Myeloid-erythroid cell lines (K562, 
EM2, and BV173) are described in previous publications (9, 11, 22, 

†Status of patient at the time cell line was derived. BC, blast crisis; 
Acc, accelerated phase. 

‡Presence (+) or absence (-) of Ph1 as demonstrated by karyotypic 
or Southern blot analysis. 

§P210c-abl detected as described in legend to Fig. 1. B-cell lines 
derived from blast-crisis and accelerated-phase patients had levels 
of P210 3- to 5-fold higher (+++) than levels of P145. Chronic-phase-derived 
cell lines had P210 levels lower than or just equivalent 
(+) to the level of P145. Myeloid and erythroid lines had levels of 
P210 5- to 10-fold higher than P145 (+++++). 

¶Eight-kilobase bcr-abl mRNA detected as described in legend to 
Fig. 2. Symbols: ±, borderline detectable; +++++, level of 8-kb 
mRNA 5- to 10-fold higher than that of the 6- and 7-kb c-abl mRNA 
species; +++, level of 8-kb mRNA 3- to 5-fold higher than that of 
the 6- and 7-kb species; +, a level approximately equivalent to that 
of the 6- and 7-kb messages. 
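
The semiquantitative symbols defined in the Table 1 footnotes can be read as bins of the P210/P145 ratio. The sketch below is only an illustration of that binning, not the authors' scoring procedure: the function name `level_symbol`, the handling of the shared boundary at 5-fold (assigned here to the higher class), and the "unclassified" label for ratios the footnote does not cover are all assumptions.

```python
def level_symbol(ratio: float) -> str:
    """Map a P210/P145 ratio to a Table 1-style symbol (illustrative sketch).

    Bins follow the footnote wording: ratios at or below 1 score "+",
    3- to 5-fold scores "+++", and 5- to 10-fold scores "+++++".
    """
    if ratio <= 0:
        return "-"             # assumption: no detectable P210
    if ratio <= 1:
        return "+"             # lower than or just equivalent to P145
    if 3 <= ratio < 5:
        return "+++"           # 3- to 5-fold higher than P145
    if 5 <= ratio <= 10:
        return "+++++"         # 5- to 10-fold higher than P145
    return "unclassified"      # assumption: range not defined in the footnote


if __name__ == "__main__":
    # Example ratios taken from the text: about one-fifth, 4- to 5-fold,
    # and about 10-fold the level of P145.
    for ratio in (0.2, 4.5, 10.0):
        print(ratio, level_symbol(ratio))
```

Ratios between 1 and 3, or above 10, are deliberately left unclassified because the footnote gives no symbol for them.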

data not shown). There was no difference in the copy number 
of abl-related sequences as judged by Southern blot analysis 
(Fig. 4). Only the K562 cell line control showed an amplification 
of abl sequences, as previously reported (22, 23). 
These combined data suggest that differential bcr-abl mRNA 
expression from a single gene template is responsible for the 
variable levels of P210c-abl detected. This could be mediated 






Fig. 2. Analysis of steady-state abl protein levels by immunoblotting. 
Cell extracts prepared from 2 x 10^7 cells of lines 
SK-CML7BN-2 (A,-), SK-CML7Bt-33 (A,+), SK-CML8BN-10 (B,-), 
and SK-CML8Bt-3 (B,+) were concentrated by immunoprecipitation 
with anti-pEX-2 plus anti-pEX-5. Samples were then electrophoresed 
in a NaDodSO4/8% polyacrylamide gel and transferred to 
nitrocellulose, using protease-facilitated transfer (18). abl proteins 
were detected using a mixture of two monoclonal antibodies directed 
against the pEX-2 and pEX-5 abl-protein fragments produced in 
bacteria (19) as a probe and a peroxidase-conjugated goat anti-mouse 
second-stage antibody (Bio-Rad) for development. 






Fig. 3. Comparison of abl RNA levels in Ph1-positive and 
-negative B-cell lines. The levels of the normal 6- and 7-kb c-abl 
RNAs and the 8-kb bcr-abl RNA were analyzed by blot hybridization 
using a v-abl probe. RNA was extracted from Ph1-negative lines 
SK-CML7BN-2 (A,-) and SK-CML16BN-1 (B,-), from Ph1-positive 
lines SK-CML7Bt-33 (A,+) and SK-CML16Bt-3 (B,+), and 
from line K562 (C,+) by the NaDodSO4/urea/phenol method (20). 
Polyadenylylated RNA was purified by oligo(dT) affinity chromatography, 
and 15 µg of each sample was electrophoresed in a 1% 
agarose/formaldehyde gel and then transferred to nitrocellulose. The 
blotted RNAs were hybridized with a nick-translated v-abl fragment 
probe (21) and then autoradiographed for 4 days. 



by factors influencing the transcription rate of the bcr-abl 
gene or the stability of the mRNA. 

DISCUSSION 

Several lines of evidence suggest that formation of Ph1 is not 
the primary event that affects the stem cell in CML. Patients 
have been identified that present with the clinical picture of 
CML but only later develop Ph1 (1). This observation, 
coupled with studies of G6PD (glucose-6-phosphate dehydrogenase)-heterozygous 
females with CML that demonstrate 
stem-cell clonality by isozyme analysis among cell 
populations that lack the Ph1 marker, supports a secondary 
or complementary role for Ph1 in the progression of the 
disease (24, 25). This chromosome marker is found in 
chronic, accelerated, and blast-crisis phases of the disease. It 
is likely that Ph1 confers some growth advantage, since cells 
with the marker chromosome eventually predominate the 
marrow and peripheral blood even in chronic phase. During 
the phase of blast crisis, many patients develop additional 
chromosome abnormalities, including duplication of Ph1, a 
variety of trisomies, and complex translocations (26). This 
is suggestive evidence for Ph1 being a necessary but not 
sufficient genetic change for the full evolution of the 
disease. 

Fig. 4. Southern blot analysis of abl sequences in Ph1-positive 
and -negative B-cell lines. High molecular weight DNA (15 µg) was 
digested with restriction endonuclease BamHI, separated in a 0.8% 
agarose gel, and then transferred to nitrocellulose. The blotted DNA 
fragments were hybridized with a nick-translated, 2.4-kb Bgl II v-abl 
fragment (1.5 x 10^8 cpm/µg; ref. 21) and exposed for 4 days. (A) 
Autoradiogram of abl-specific fragments in cell lines HL-60 (lane 1), 
EM2 (lane 2), K562 (lane 3), SK-CML7Bt-33 (lane 4), SK-CML8Bt-3 
(lane 5), SK-CML16Bt-1 (lane 6), SK-CML21Bt-6 (lane 7), 
SK-CML35Bt-2 (lane 8), SK-CML7BN-2 (lane 9), SK-CML8BN-2 (lane 
10), and SK-CML35BN-1 (lane 11). (B) Ethidium bromide staining of 
agarose gel prior to transfer to nitrocellulose, showing the level of 
variation in amount of DNA loaded per lane. 

The realization that one molecular result of Ph1 is the 
generation of a chimeric bcr-abl protein with functional 
characteristics and structure analogous to the gag-abl transforming 
protein of the Abelson murine leukemia virus 
strengthens the argument for an important role of Ph1 in the 
pathogenesis of CML. Although the Abelson virus is generally 
considered a rapidly transforming retrovirus, its effects 
can range from overcoming growth factor requirements, to 
cellular lethality, to induction of highly oncogenic tumors in 
a number of hematopoietic cell lineages (27, 28). Even in the 
transformation of murine cell targets, there are several lines 
of evidence that suggest that the growth-promoting activity of 
the v-abl gene product is complemented by further cellular 
changes in the production of the malignant-cell phenotype 
(29-31). 

The regulation of bcr-abl gene expression is complex 
because the 5' end of the gene is derived from the non-abl 
sequences, bcr, normally found on chromosome 22 (6). The 
level of stable message for the normal bcr gene and the 
normal abl gene are both much lower than the level of the 
bcr-abl message and protein from cell lines and clinical 
specimens derived from myeloid blast-crisis patients (5, 6, 
11). Therefore, the high level of bcr-abl expression cannot 
simply be attributed to the regulatory sequences associated 
with bcr. Possibly, creation of the chimeric gene disrupts the 
normal regulatory sequences and results in a higher level of 
expression. Variation in bcr-abl expression may result from 
secondary changes in the structure of the chimeric gene or 
function of trans-acting factors that occur during evolution of 
the disease. Our analysis of P210c-abl and the 8-kb mRNA in 
Epstein-Barr virus-transformed Ph1-positive B-cell lines 
demonstrates that stable message and protein levels from the 
bcr-abl gene can vary over a wide range. This variation does 
not result from a change in the number of bcr-abl templates 
secondary to gene amplification but more likely from changes 
in either transcription rate or mRNA stability. We suspect 
this range of bcr-abl expression is not limited to lymphoid 
cells. Analysis of peripheral blood leukocytes derived from 
an unusual CML patient who has been in chronic phase with 
myeloid predominance for 16 years showed a level of 
P210c-abl one-fifth that of P145c-abl, as detected by metabolic 
labeling with [32P]orthophosphate and immunoprecipitation 
(S.C., O.N.W., and P. Greenberg, unpublished observations). 
Lower levels of expression of the chimeric mRNA 
have been demonstrated in clinical samples from chronic-phase 
CML patients compared to acute-phase CML patients 
(9). Others have reported chronic-phase patients with variable 
but, in some cases, relatively high levels of the bcr-abl 
mRNA (32). The sampling variation and the heterogeneous 
mixture of cell types in clinical samples complicate such 
analyses. Further work is needed to evaluate whether there 
is a defined change in P210c-abl expression during the progression 
of CML. It is interesting to note that among the 
limited sample of Ph1-positive B-cell lines we have examined 
(Table 1), we have seen higher levels of P210c-abl in those 
derived from patients at more advanced stages of the disease. 
It will be important to search for cell-type-specific mechanisms 
that might regulate expression of bcr-abl from Ph1. 

CHAPTER 29 



Regulation of transcription 



The phenotypic differences that distinguish the 
various kinds of cells in a higher eukaryote are 
largely due to differences in the expression of 
genes that code for proteins, that is, those transcribed 
by RNA polymerase II. In principle, the 
expression of these genes might be regulated at 
any one of several stages. The concept of the 
"level of control" implies that gene expression 
is not necessarily an automatic process once it 
has begun. It could be regulated in a gene-specific 
way at any one of several sequential 
steps. We can distinguish (at least) five potential 
control points, forming the series: 

Activation of gene structure 
↓ 
Initiation of transcription 
↓ 
Processing the transcript 
↓ 
Transport to cytoplasm 
↓ 
Translation of mRNA 

The existence of the first step is implied by 
the discovery that genes may exist in either of 
two structural conditions. Relative to the state 
of most of the genome, genes are found in 
an "active" state in the cells in which they 
are expressed (see Chapter 27). The change of 
structure is distinct from the act of transcription, 
and indicates that the gene is "transcribable." 
This suggests that acquisition of the 
"active" structure must be the first step in gene 
expression. 

Transcription of a gene in the active state is 
controlled at the stage of initiation, that is, by 
the interaction of RNA polymerase with its promoter. 
This is now becoming susceptible to 
analysis in the in vitro systems (see Chapter 
28). For most genes, this is a major control 
point; probably it is the most common level of 
regulation. 

There is at present no evidence for control 
at subsequent stages of transcription in eukary- 
otic cells, for example, via antitermination 
mechanisms. 

The primary transcript is modified by capping 
at the 5' end, and usually also by polyadenylation 
at the 3' end. Introns must be spliced out 
from the transcripts of interrupted genes. The 
mature RNA must be exported from the nucleus 
to the cytoplasm. Regulation of gene expression 
by selection of sequences at the level of nuclear 
RNA might involve any or all of these stages, 
but the one for which we have most evidence 
concerns changes in splicing; some genes are 
expressed by means of alternative splicing pat- 
terns whose regulation controls the type of pro- 
tein product (see Chapter 30). 

Finally, the translation of an mRNA in the cyto- 
plasm can be specifically controlled. There is little 
evidence for the employment of this mechanism in 
adult somatic cells, but it does occur in some 
embryonic situations, as described in Chapter 7. 
The mechanism is presumed to involve the block- 
ing of initiation of translation of some mRNAs by 
specific protein factors. 

But having acknowledged that control of gene 
expression can occur at multiple stages, and 
that production of RNA cannot inevitably be 
equated with production of protein, it is clear 
that the overwhelming majority of regulatory 
events occur at the initiation of transcription. 
Regulation of tissue-specific gene transcription 
lies at the heart of eukaryotic differentiation; 
indeed, we see examples in Chapter 38 in 
which proteins that regulate embryonic development 
prove to be transcription factors. A regulatory 
transcription factor serves to provide 
common control of a large number of target 
genes, and we seek to answer two questions 
about this mode of regulation: what identifies 
the common target genes to the transcription 
factor, and how is the activity of the transcription 
factor itself regulated in response to intrinsic 
or extrinsic signals? 



Response elements identify genes under common
regulation



The principle that emerges from characterizing 
groups of genes under common control is that 
they share a promoter element that is recognized 
by a regulatory transcription factor. An element 
that causes a gene to respond to such a factor 
is called a response element; examples are the
HSE (heat shock response element), GRE
(glucocorticoid response element), SRE (serum 
response element). 

The properties of some inducible transcription 
factors and the elements that they recognize are 
summarized in Table 29.1. Response elements 
have the same general characteristics as 
upstream elements of promoters or enhancers. 
They contain short consensus sequences, and 
copies of the response elements found in dif- 
ferent genes are closely related, but not neces- 
sarily identical. The region bound by the factor 
extends for a short distance on either side of 



Table 29.1 Inducible transcription factors bind to
response elements that identify groups of promoters
or enhancers subject to coordinate control.

Regulatory Agent   Module   Consensus          Factor
Heat shock         HSE      CNNGAANNTCCNNG     HSTF
Glucocorticoid     GRE      TGGTACAAATGTTCT    Receptor
Phorbol ester      TRE      TGACTCA            AP1
Serum              SRE      CCATATTAGG         SRF



the consensus sequence. In promoters, the ele- 
ments are not present at fixed distances from 
the startpoint, but are usually <200 bp upstream
of it. The presence of a single element usually
is sufficient to confer the regulatory response, 
but sometimes there are multiple copies. 
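The idea of locating response elements by their consensus sequences can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the consensus strings are copied from Table 29.1, "N" is treated as any base, the promoter fragment is invented for the example, and only exact matches are reported, although the text notes that copies in different genes are related but not necessarily identical.

```python
import re

# Consensus sequences copied from Table 29.1; "N" stands for any base.
CONSENSUS = {
    "HSE": "CNNGAANNTCCNNG",
    "GRE": "TGGTACAAATGTTCT",
    "TRE": "TGACTCA",
    "SRE": "CCATATTAGG",
}


def consensus_to_pattern(consensus):
    """Translate a consensus string into regex text, mapping N to [ACGT]."""
    return "".join("[ACGT]" if base == "N" else base for base in consensus)


def find_elements(sequence):
    """Return (element name, 0-based start) pairs sorted by position."""
    sequence = sequence.upper()
    hits = []
    for name, consensus in CONSENSUS.items():
        # The lookahead lets overlapping matches be reported as well.
        lookahead = "(?=" + consensus_to_pattern(consensus) + ")"
        for match in re.finditer(lookahead, sequence):
            hits.append((name, match.start()))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical promoter fragment, invented purely for illustration.
promoter = "AAACTCGAATGTCCAAGTTTTGACTCATT"
print(find_elements(promoter))
```

A scan that tolerates mismatches (for example, by scoring each window against the consensus) would be closer to how related but non-identical elements are recognized; the exact-match version is kept here for brevity.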

Response elements may be located in pro- 
moters or in enhancers. Some types of elements 
are typically found in one rather than the other: 
usually an HSE is found in a promoter, while a 
GRE is found in an enhancer. We assume that 
all response elements function by the same 
general principle. A gene is regulated by a 
sequence at the promoter or enhancer that is 
recognized by a specific protein. The protein 
functions as a transcription factor needed for
RNA polymerase to initiate. Active protein is
available only under conditions when the gene is
to be expressed; its absence means that the promoter
is not activated by this particular circuit.

An example of a situation in which many
genes are controlled by a single factor is pro- 
vided by the heat shock response. This is com- 
mon to a wide range of prokaryotes and 
eukaryotes and involves multiple controls of
gene expression: an increase in temperature
turns off transcription of some genes, turns on
transcription of the heat shock genes, and
causes changes in the translation of mRNAs.
The control of the heat shock genes illustrates
the differences between prokaryotic and
eukaryotic modes of control. In bacteria, a new
sigma factor is synthesized that directs RNA
polymerase holoenzyme to recognize an alter-

Discordant Protein and mRNA Expression in
Lung Adenocarcinomas*

Guoan Chen, Tarek G. Gharib, Chiang-Ching Huang, Jeremy M. G. Taylor,
David E. Misek, Sharon L. R. Kardia, Thomas J. Giordano, Mark D. Iannettoni,
Mark B. Orringer, Samir M. Hanash, and David G. Beer



The relationship between gene expression measured at
the mRNA level and the corresponding protein level is not
well characterized in human cancer. In this study, we
compared mRNA and protein expression for a cohort of
genes in the same lung adenocarcinomas. The abundance
of 165 protein spots representing 98 individual
genes was analyzed in 76 lung adenocarcinomas and nine
non-neoplastic lung tissues using two-dimensional
polyacrylamide gel electrophoresis. Specific polypeptides
were identified using matrix-assisted laser
desorption/ionization mass spectrometry. For the same 85 samples,
mRNA levels were determined using oligonucleotide
microarrays, allowing a comparative analysis of mRNA and
protein expression among the 165 protein spots. Twenty-eight
of the 165 protein spots (17%) or 21 of 98 genes
(21.4%) had a statistically significant correlation between
protein and mRNA expression (r > 0.2445; p < 0.05);
however, among all 165 proteins the correlation coefficient
values (r) ranged from -0.467 to 0.442. Correlation
coefficient values were not related to protein abundance.
Further, no significant correlation between mRNA and
protein expression was found (r = -0.025) if the average
levels of mRNA or protein among all samples were applied
across the 165 protein spots (98 genes). The mRNA/protein
correlation coefficient also varied among proteins
with multiple isoforms, indicating potentially separate
isoform-specific mechanisms for the regulation of
protein abundance. Among the 21 genes with a significant
correlation between mRNA and protein, five genes
differed significantly between stage I and stage III lung
adenocarcinomas. Using a quantitative analysis of mRNA
and protein expression within the same lung adenocarcinomas,
we showed that only a subset of the proteins
exhibited a significant correlation with mRNA abundance.
Molecular & Cellular Proteomics 1:304-313, 2002.



Lung cancer is the leading cause of cancer death for both
men and women in the United States. Adenocarcinomas of
the lung comprise ~40% of all new cases of non-small cell
lung cancer and are now the most common histologic type.
Functional genomics, broadly defined as the comprehensive
analysis of genes and their products, has become a recent
focus of the life sciences (1). Application of these approaches to
lung adenocarcinomas has the potential to aid in the identification
of high risk patients with resectable early stage lung cancer
who may benefit from adjuvant therapy, as well as to identify
new therapeutic targets. In human lung cancer, however, little is
currently understood regarding the relationship between gene
expression as determined by measuring mRNA levels and the
corresponding abundance of the protein products.

A number of powerful techniques for analysis of gene expression
have been used, including differential display (2),
serial analysis of gene expression (3), DNA microarrays (4),
and proteomics via two-dimensional polyacrylamide gel electrophoresis
and mass spectrometry (5). Bioinformatics tools
have also been developed to help determine quantitative
mRNA/protein expression profiles of all types of cells and
tissues (6) and now can be applied to benign and malignant
tumors. DNA microarrays (cDNA and oligonucleotide) permit
the parallel assessment of thousands of genes and have been
utilized in gene expression monitoring (7), polymorphism analysis
(8), and DNA sequencing (9). Recent studies have focused
on classification or identification of subgroups of lung
tumors using DNA microarrays (10, 11). The use of mRNA
expression patterns by themselves, however, is insufficient for
understanding the expression of protein products, as additional
post-transcriptional mechanisms, including protein
translation, post-translational modification, and degradation,
may influence the level of a protein present in a given cell or
tissue. Proteomic analysis, a complementary technology to
DNA microarrays for monitoring gene expression, involves
protein separation and quantitative assessment of protein
spots using 2D-PAGE (see footnote 1) and protein identification using mass
spectrometry. By combining proteomic and transcriptional
analyses of the same samples, however, it may be possible to
understand the complex mechanisms influencing protein expression
in human cancer.

In this study, we determined mRNA and protein levels for
165 proteins (98 genes) in 76 lung adenocarcinomas and nine

1 The abbreviations used are: 2D, two-dimensional; MALDI-MS,
matrix-assisted laser desorption/ionization mass spectrometry.

Protein and mRNA Correlation in Lung Adenocarcinomas



Table I
Correlation coefficients of protein and mRNA where only one spot was present on 2D gels
r*, correlation coefficient value > 0.2445; p < 0.05. Values in boldface are significant at p < 0.05.

Spot   Unigene     Gene name   r*        Protein name
1104   Hs.184510   SFN          0.4337   [illegible]
0994   Hs.77840    ANXA4        0.4219   Annexin IV
1314   Hs.10958    DJ-1         0.3382   DJ-1 protein/MER5
1454   Hs.75428    SOD1         0.3863   Superoxide dismutase (Cu-Zn)
1638   Hs.227751   LGALS1       0.3318   Galectin 1
0264   Hs.129548   HNRPK        0.3034   Transformation up-regulated nuclear protein
1405   Hs.111334   FTL          0.2849   Ferritin light chain
0963   Hs.300711   ANXA5        0.2468   Annexin V
1252   Hs.4745     PSMC         0.2445   26 S proteasome p28
0906   Hs.234489   LDHB         0.4420   L-lactate dehydrogenase H chain (LDH-B)
1171   Hs.241515   COX11        0.2310   COX 11
1160   Hs.181013   PGAM1        0.2023   Phosphoglycerate mutase
0759   Hs.74635    DLD          0.1965   Dihydrolipoamide dehydrogenase precursor
1193   Hs.83383    AOE372       0.1932   Antioxidant enzyme AOE372
0172   Hs.3069     HSPA9B       0.1872   GRP75
0777   Hs.979      PDHB         0.1855   Pyruvate dehydrogenase E1-β subunit precursor
1249   Hs.226795   GSTP1        0.1773   Glutathione S-transferase pi (GST-pi)
1665   Hs.76136    TXN          0.1732   Thioredoxin
1205   Hs.82314    HPRT1        0.1588   HG phosphoribosyltransferase
1230   Hs.279860   TPT1         0.1466   Translationally controlled tumor protein (TCTP)
0603   Hs.181357   LAMR1        0.1463   LAMR
1358   Hs.28914    APRT         0.1399   Adenine phosphoribosyl transferase
1410   Hs.82113    DUT          0.1213   dUTP pyrophosphatase (dUTPase)
1625   Hs.112378   LIMS1        0.1213   Pinch-2 protein
0371   Hs.250502   CA8          0.1122   Carbonic anhydrase-related protein; Syntax^
0269   Hs.82916    CCT6A        0.1106   Chaperonin-like protein
1143   Hs.11465    GSTTLp28     0.0997   Glutathione S-transferase homolog (GST homolog)
1456   Hs.118638   NME1         0.0932   Nm23 (NDPKA)
1598   Hs.278503   RIG          0.0905   RIG (U32331)
1354   Hs.89761    ATP5D        0.0904   F1F0-type ATP synthase subunit d
1445   Hs.155485   HIP2         0.0843   Huntingtin interacting protein 2 (HIP2)
1479   Hs.177486   APP          0.0746   Amyloid B4A
0608   Hs.182265   KRT19        0.0439   Cytokeratin 19
1071   Hs.10842    RAN          0.0277   GTP-binding nuclear protein RAN (TC4)
0991   Hs.297939   CTSB         0.0254   Cathepsin B
0642   Hs.77274    PLAU         0.0248   Urokinase plasminogen activator
0323   Hs.198248   B4GALT1      0.0183   β 1,4-galactosyl transferase
0613   Hs.1247     APOA4        0.0176   Apolipoprotein A4 (ApoA4)
1338   Hs.104143   CLTA         0.0123   Clathrin light chain A
0902   Hs.5123     SID6-306     0.0117   Cytosolic inorganic pyrophosphatase
1688   Hs.1473     GRP         -0.0040   Preprogastrin-releasing peptide
0265   Hs.274402   HSPA1B      -0.0071   Heat shock-induced protein
1414   Hs.77541    ARF5        -0.0096   ADP-ribosylation factor 1
0710   Hs.97206    HIP1        -0.0114   Huntingtin interacting protein 1 (HIP1)
0532   Hs.170328   MSN         -0.0132   Moesin/E
0525   Hs.284255   ALPP        -0.0148   Alkaline phosphatase, placental
0513   Hs.76901    PDIR        -0.0289   Protein disulfide isomerase-related protein 5
1659   Hs.256697   HINT        -0.0312   Protein kinase C inhibitor
1262   Hs.7016     RAB7        -0.0362   Rab 7 protein
0190   Hs.184411   ALB         -0.0470   Albumin
0948   Hs.2795     LDHA        -0.0549   Lactate dehydrogenase-A (LDHA)
0502   Hs.180532   GPI         -0.0575   Hsp89
0152   Hs.75410    HSPA5       -0.0640   GRP78
1054   Hs.74276    CLIC1       -0.0686   Nuclear chloride channel (RNCC protein)
0709   Hs.253495   SFTPD       -0.0936   Pulmonary surfactant protein D
0867   Hs.78998    PCNA        -0.0982   PCNA
0165   Hs.180414   HSPA8       -0.1014   Heat shock cognate protein, 71 kDa
1109   Hs.75103    YWHAZ       -0.1018   14-3-3 ζ
0137   Hs.554      SSA2        -0.1032   Ro/SS-A antigen
0278   Hs.4112     TCP1        -0.1237   T-complex protein 1, α subunit
1769   Hs.9614     NPM1        -0.1738   B23/numatrin
00*9   Hs.74335    HSPCB       -0.2049   Hsp90
2511   Hs.153179   FABP5       -0.2109   E-FABP/FABP5
1739   Hs.16488    CALR        -0.2344   Calreticulin 32
1138   H&401961    GSTM4       -0.2438   Glutathione S-transferase M4 (GST m4)
2533   Hs.77080    PSMB6       -0.2512   Macropain subunit Δ


nonneoplastic lung tissues. Protein levels were determined
using quantitative 2D-PAGE analysis, and the separated protein
polypeptides were identified using matrix-assisted laser
desorption/ionization mass spectrometry (MALDI-MS). The
corresponding mRNA levels for the identified proteins within
the same samples were determined using oligonucleotide
microarrays. Correlation analyses showed that protein abundance
is likely a reflection of the transcription for a subset of
proteins, but translation and post-translational modifications
also appear to influence the expression levels of many individual
proteins in lung adenocarcinomas.

EXPERIMENTAL PROCEDURES 

Tissues - Fifty-seven stage I and 19 stage III lung adenocarcinomas,
as well as nine non-neoplastic lung tissue samples, were used
for protein and mRNA analyses. Patient consent was obtained, and
the project was approved by the Institutional Review Board. All tissues
were obtained after resection at the University of Michigan
Health System between May 1991 and July 1998. Tissues were all
snap-frozen in liquid nitrogen and then stored at -80 °C. The patients
included 46 females and 30 males ranging in age from 40.9 to 84.6
(average 63.8) years. Most patients (6676) demonstrated a positive
smoking history. Sixty-one tumor samples were classified as bronchial-derived,
14 were classified as bronchoalveolar, and one had
both features. Eighteen tumor samples were classified as well differentiated,
38 were classified as moderate, and 19 were classified as
poorly differentiated adenocarcinomas. Hematoxylin-stained cryostat
sections (5 µm), prepared from the same tumor pieces to be utilized
for protein and mRNA isolation, were evaluated by a pathologist and
compared with hematoxylin- and eosin-stained sections made from
paraffin blocks of the same tumors. Specimens were excluded from
analysis if they showed unclear or mixed histology (e.g. adenosquamous),
tumor cellularity less than 70%, potential metastatic origin as
indicated by previous tumor history, extensive lymphocytic infiltration,
or fibrosis or if the patient had received prior chemotherapy or
radiotherapy.

Oligonucleotide Array Hybridization - The HuGeneFL oligonucleotide
arrays (Affymetrix, Santa Clara, CA) containing 6800 genes were
used in this study. Total RNA was isolated from all samples using
Trizol reagent (Invitrogen). The resulting RNA was then subjected to
further purification using RNeasy spin columns (Qiagen). Preparation
of cRNA, hybridization, and scanning of the HuGeneFL arrays were
performed according to the manufacturer's protocol (Affymetrix,
Santa Clara, CA). Data analysis was performed using GeneChip 4.0
software. The gene expression profile of each tumor was normalized
to the median gene expression profile for the entire sample. Details of
data trimming and normalization are described elsewhere (11).

2D-PAGE and Quantitative Protein Analysis - Tissue for both protein
and mRNA isolation came from contiguous areas of each sample.
Protein separation using 2D-PAGE, silver staining, and digitization
were performed as described previously (12, 13). Our 2D-PAGE system
allows us to run 20 gels at one time (one batch). Spot detection
and quantification were accomplished utilizing Bio Image Visage System
software (BioImage Corp., Ann Arbor, MI). The integrated intensity
of each spot was calculated as the measured optical density
units × mm². Of the total possible 2000 spots detectable on each gel,
820 spots on the gel of each sample were matched using a Gel-ed
match program with the same spots on a chosen "master" gel. In
each sample, 250 ubiquitously expressed reference spots were used
to adjust for variations between gels, such as that created by subtle
differences in protein loading or gel staining. Slight differences because
of batch were corrected after spot-size quantification.

Mass Spectrometry and 2D Western Blotting - Preparative 2D gels
were run using extracts from A549 lung adenocarcinoma cells (obtained
from ATCC) and using the identical experimental conditions as
the analytical 2D gels, except 30% more protein was loaded. The
resolved protein gels were silver-stained using successive incubations
in 0.02% sodium thiosulfate for 2 min, 0.1% silver nitrate for 40
min, and 0.014% formaldehyde plus 2% sodium carbonate for 10
min. For protein identification, protein polypeptides underwent trypsin
digestion followed by MALDI-MS using a MALDI-TOF Voyager-DE
mass spectrometer (Perseptive Biosystems, Framingham, MA). The
masses were compared with known trypsin digest databases using
the MS-FIT database (University of California, San Francisco;
prospector.ucsf.edu/...). Some of the polypeptides
included in the analysis had been identified prior to this study on
the basis of sequencing (14). The identified protein spots used in this
paper are shown in Fig. 1A. The method for 2D-PAGE Western blot
verification was as described [...].
GRP58 and Op18 are shown in Fig. 1, C and E; the others, such as
GRP78, GRP75, HSP70, HSC70, KRT8, KRT18, KRT19, Vimentin,
ApoJ, 14-3-3, Annexin I, Annexin II, PGP9.5, DJ-1, GST-pi, and
PGAM, are described elsewhere (footnote 2).

Statistical Analysis - Missing values were replaced with the mean
value of the protein spot. The transform x -> log(1 + x) was applied
to normalize all protein expression values. The relationship between
protein and mRNA expression levels within the same samples was
examined using the Spearman correlation coefficient [...]. To
identify potentially significant correlations between gene and protein
expression, we used an analytical strategy similar to SAM (significance
analysis of microarrays) (17), which uses a permutation technique
to determine the significance of changes in gene expression
between different biological states. To obtain permuted correlation
coefficients between gene and protein expression, genes were exchanged
first in such a way that permutated correlation coefficients
were calculated based on pseudo pairs of genes and proteins. The
distribution of permutated correlation coefficients became stable after
60 permutations. This procedure was then repeated 60 times to
obtain 60 sets of permutated correlation coefficients. For each of the
60 permutations, the correlations of genes and proteins were ranked

2 Chen et al., submitted for publication.

Table II
Correlation coefficients of protein and mRNA where multiple isoforms were present on 2D gels
r, correlation coefficient value > 0.2445; p < 0.05. Values in boldface are significant at p < 0.05.

Spot   Unigene     Gene name   r         Protein name
1494   Hs.81915    LAP18        0.4003   OP18 (Stathmin)
0957   Hs.77899    TPM1         0.3930   Tropomyosins 1-5
0353   Hs.289101   GRP58        0.3802   Protein disulfide isomerase (GRP58)
0855   Hs.169476   GAPD         0.3693   Glyceraldehyde-3-phosphate dehydrogenase
1196   Hs.41707    HSPB3        0.3668   Hsp27
12W    Hs.83848    TPI1         0.3395   Triose phosphate isomerase (TPI)
0523   Hs.65114    KRT18        0.3335   Cytokeratin 18
1492   Hs.81915    LAP18        0.3234   OP18 (Stathmin)
1493   Hs.81915    LAP18        0.3154   OP18 (Stathmin)
1181   Hs.78225    ANXA1        0.3102   Annexin variant I
0439   Hs.242463   KRT8         0.3048   Cytokeratin 8
0505   Hs.297753   VIM          0.2939   Vimentin
0593   Hs.297753   VIM          0.2809   Vimentin
1874   Hs.75313    AKR1B1       0.2790   Aldose reductase
0935   Hs.75544    YWHAH        0.2775   14-3-3 η
2524   Hs.78225    ANXA1        0.2612   Annexin I
2324   Hs.65114    KRT18        0.2601   Cytokeratin 18
1192   Hs.41707    HSPB3        0.2558   Hsp27
0350   Hs.289101   GRP58        0.2516   Phospholipase C (GRP58)
0992   Hs.75313    AKR1B1      -0.2460   Aldose reductase
0851   Hs.75313    AKR1B1       0.0761   Aldose reductase
0853   Hs.75313    AKR1B1      -0.0875   Aldose reductase
2503   Hs.76392    ALDH1       -0.0565   Aldehyde dehydrogenase
0381   Hs.76392    ALDH1       -0.0371   Aldehyde dehydrogenase
0371   Hs.76392    ALDH1       -0.0680   Aldehyde dehydrogenase
1179   Hs.78225    ANXA1        0.2062   Annexin variant I
0762   Hs.78225    ANXA1       -0.0739   Annexin I
0760   Hs.78225    ANXA1       -0.0228   Annexin I
2506   Hs.217493   ANXA2        0.2223   Lipocortin (annexin II)
0772   Hs.217493   ANXA2        0.2080   Lipocortin (annexin II)
0723   Hs.217493   ANXA2        0.0701   Lipocortin
1239   Hs.93194    APOA1        0.1133   Apolipoprotein A1 (ApoA1)
1237   Hs.93194    APOA1       -0.0373   Apolipoprotein A1 (ApoA1)
1234   Hs.93194    APOA1       -0.0894   Apolipoprotein A1 (ApoA1)
0428   Hs.25       ATP5B        0.0080   ATP synthase β subunit precursor
0427   Hs.25       ATP5B        0.0122   ATP synthase β subunit precursor
0424   Hs.25       ATP5B       -0.0992   ATP synthase β subunit precursor
0863   Hs.75106    CLU         -0.0483   Apolipoprotein J (ApoJ)
0760   Hs.75106    CLU         -0XW43    Apolipoprotein J (ApoJ)
1527   Hs.119140   EIF5A       -0.0726   eIF5A
1484   Hs.119140   EIF5A       -0.0376   eIF5A
1728   Hs.5241     FABP1       -0.1916   L-FABP
1712   Hs.5241     FABP1       -0.0473   L-FABP
0947   Hs.169476   GAPD         0.1745   Glyceraldehyde-3-phosphate dehydrogenase
1232   Hs.75207    GLO1         0.2249   Glyoxalase-1
1229   Hs.75207    GLO1         0.0450   Glyoxalase-1
1595   Hs.158300   HAP1        -0.0137   Huntingtin-associated protein 1 (neuroan 1)
1010   Hs.75990    HP          -0.4672   α-Haptoglobin
1459   Hs.75990    HP           0.0802   α-Haptoglobin
1458   Hs.75990    HP          -0.0305   α-Haptoglobin
0619   Hs.75990    HP           0.0401   β-Haptoglobin
0615   Hs.75990    HP          -0.0034   β-Haptoglobin
1250   Hs.41707    HSPB3       -0.1034   Hsp27
0543   Hs.79037    HSPD1        0.1074   Hsp60
0335   Hs.79037    HSPD1        0.2265   Hsp60
0333   Hs.79037    HSPD1        0.1383   Hsp60
0331   Hs.79037    HSPD1        0.1603   Hsp60
2381   Hs.65114    KRT18        0.2016   Cytokeratin 18
[illegible] Hs.65114 KRT18      0.1106   Cytokeratin 18
0529   Hs.65114    KRT18        0.1279   Cytokeratin 18
0528   Hs.65114    KRT18        0.0414   Cytokeratin 18
0527   Hs.65114    KRT18        0.0436   Cytokeratin 18
0514   Hs.65114    KRT18        0.0733   Cytokeratin 18
0451   Hs.242463   KRT8        -0.0111   Cytokeratin 8
0448   Hs.242463   KRT8         0.0347   Cytokeratin 8
0444   Hs.242463   KRT8        -0.1311   Cytokeratin 8
0443   Hs.242463   KRT8         0.0942   Cytokeratin 8
1488   Hs.81915    LAP18        0.0495   OP18 (Stathmin)
0321   Hs.75655    P4HB        -0.0546   PDI (prolyl 4-OH-B)
0320   Hs.75655    P4HB        -0.004V   PDI (prolyl 4-OH-B)
1063   Hs.75323    PHB          0.0441   Prohibitin
0837   Hs.75323    PHB          0.1402   Prohibitin
0326   Hs.297681   SERPINA1    -0.0227   α-1-Antitrypsin
0322   Hs.297681   SERPINA1    -0.0277   α-1-Antitrypsin
0241   Hs.297681   SERPINA1    -0.0148   α-1-Antitrypsin
1280   Hs.301254   SFTPA1      -0.1488   Pulmonary surfactant-associated protein
1278   Hs.301254   SFTPA1      -0.2040   Pulmonary surfactant-associated protein
0866   Hs.73980    TNNT1        0.1162   Troponin T
0778   Hs.73980    TNNT1        0.0740   Troponin T
1213   Hs.83848    TPI1         0.0024   Triose phosphate isomerase (TPI)
1210   Hs.83848    TPI1         0.0490   Triose phosphate isomerase (TPI)
1207   Hs.83848    TPI1        -0.1615   Triose phosphate isomerase (TPI)
1204   Hs.83848    TPI1         0.0209   Triose phosphate isomerase (TPI)
1202   Hs.83848    TPI1         0.0721   Triose phosphate isomerase (TPI)
1161   Hs.83848    TPI1         0.2266   Triose phosphate isomerase (TPI)
1002   Hs.77899    TPM1        -0.1040   Tropomyosin dean-product
1039   Hs.77899    TPM1        -0.2999   Cytoskeletal tropomyosin
1035   Hs.77899    TPM1        -0.3821   Tropomyosin
0783   Hs.77899    TPM1         0.0757   Tropomyosins 1-6
1574   Hs.194366   TTR         -0.3065   Transthyretin
0809   Hs.194366   TTR          0.0399   Transthyretin multimers
2202   Hs.76118    UCHL1       -0.0220   Ubiquitin carboxyl-terminal hydrolase isozyme L1
1246   Hs.76118    UCHL1       -0.1261   Ubiquitin carboxyl-terminal hydrolase isozyme L1
1242   Hs.76118    UCHL1        0.1473   Ubiquitin carboxyl-terminal hydrolase isozyme L1
0606   Hs.297753   VIM          0.0951   Vimentin
0594   Hs.297753   VIM         -0.2664   Vimentin-derived protein (vktf)
0508   Hs.297753   VIM          0.1008   Vimentin-derived protein (vid2)
0419   Hs.297753   VIM          0.0032   Vimentin-derived protein (vid1)
1279   Hs.75544    YWHAH        0.0059   14-3-3 η




such that ρ_p(i) denotes the ith largest correlation coefficient for the pth
permutation. Hence, the expected correlation coefficient, ρ_E(i), was the
average over the 60 permutations, ρ_E(i) = Σ_{p=1}^{60} ρ_p(i)/60. A scatter plot of
observed correlations ρ(i) versus the expected correlations is shown in
Fig. 2D. For this study, we chose threshold Δ = 0.115 so that correlation
would be considered significant if the absolute value of the difference between
ρ(i) and ρ_E(i) was greater than the threshold. [...]
with observed correlation coefficient -0.4672) of 165 pairs of gene and
protein expression were called significant in such criteria, and the
permuted data generated an average of 5.1 falsely significant pairs of
gene and protein expression. This provided an estimated false discovery
rate (the percentage of pairs of gene and protein expression
identified by chance) for our data set.
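The permutation strategy described above can be sketched in Python as follows. This is a simplified illustration, not the authors' SAM-style software: the function names, the toy data, and the use of a plain random shuffle to form pseudo pairs are all assumptions made for the example (a shuffle can occasionally leave a protein paired with its own mRNA). Only the threshold of 0.115 and the 60 permutations come from the text.

```python
import random
import statistics


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return out


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5


def sorted_correlations(protein, mrna):
    """Correlate protein[i] with mrna[i] for every i; largest first."""
    return sorted((spearman(p, m) for p, m in zip(protein, mrna)), reverse=True)


def permutation_analysis(protein, mrna, delta=0.115, n_perm=60, seed=0):
    rng = random.Random(seed)
    observed = sorted_correlations(protein, mrna)
    permuted = []
    for _ in range(n_perm):
        shuffled = mrna[:]
        rng.shuffle(shuffled)  # pseudo pairs: protein i vs. another gene's mRNA
        permuted.append(sorted_correlations(protein, shuffled))
    n = len(observed)
    # Expected i-th largest coefficient: average over all permutations.
    expected = [statistics.fmean([perm[i] for perm in permuted]) for i in range(n)]
    called = [i for i in range(n) if abs(observed[i] - expected[i]) > delta]
    false_counts = [
        sum(abs(perm[i] - expected[i]) > delta for i in range(n)) for perm in permuted
    ]
    avg_false = statistics.fmean(false_counts)
    fdr = avg_false / len(called) if called else None
    return {"observed": observed, "expected": expected, "called": called,
            "avg_false": avg_false, "fdr": fdr}


def make_toy_data(n_genes=12, n_samples=20, n_linked=3, seed=1):
    """Invented data: the first n_linked proteins track their own mRNA."""
    rng = random.Random(seed)
    mrna = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_genes)]
    protein = []
    for g in range(n_genes):
        if g < n_linked:
            protein.append([m + rng.gauss(0, 0.3) for m in mrna[g]])
        else:
            protein.append([rng.gauss(0, 1) for _ in range(n_samples)])
    return protein, mrna


toy_protein, toy_mrna = make_toy_data()
result = permutation_analysis(toy_protein, toy_mrna)
print(len(result["called"]), "pairs called;", result["avg_false"], "false on average")
```

Dividing the average number of falsely significant pairs by the number of pairs called gives the estimated false discovery rate, which is the same ratio the paragraph describes for the 5.1 falsely significant pairs.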

RESULTS 

Correlation of Individual Proteins and mRNA Expression
within Each Tumor - We have examined quantitatively 165
protein spots on 2D gels representing 98 genes and compared
protein levels with mRNA levels for a cohort of 85 lung
adenocarcinomas and normal lung samples. Of the 165 protein
spots, 69 proteins were represented by only one known
spot on 2D gels for an individual gene, whereas 96 protein
spots showed multiple protein products from 29 different
genes. 2D Western blotting verified the proteins identified by
mass spectrometry when specific antibodies were available.
Spearman correlation coefficients of the proteins and their
associated mRNA for each protein spot were generated using
all 76 lung adenocarcinomas and nine non-neoplastic lung
tissues (see Tables I and II, and see Figs. 1 and 2). The
correlation coefficients (r) ranged from -0.467 to 0.442 (Fig.
2D). A total of 28 protein spots (21 genes) were found to have
a statistically significant correlation between expression of

Fig. 1. A, digital image of a silver-stained 2D-PAGE separation of a stage I lung adenocarcinoma showing protein spots separated by
molecular mass (MW) and isoelectric point (pI). Twenty-eight protein spots whose expression levels are correlated with mRNA abundance are
indicated by the black arrows. B, the outlined areas of A showing protein GRP58. C, 2D Western blot of GRP58 from the A549 lung
adenocarcinoma cell line. D, the outlined areas of A showing the protein isoforms of Op18. E, 2D Western blot of Op18 from A549 cells.

their protein and mRNA (r > 0.2445; p < 0.05). This accounts
for 17% (28/165) of the 165 protein spots. Among the 69
genes for which only a single protein spot was known (Table
I), nine genes (9/69, 13%) were observed to show a statistically
significant relationship between protein and mRNA
abundance (r > 0.2445; p < 0.05). The proteins whose expression
levels were correlated with their mRNA abundance
included those involved in signal transduction, carbohydrate
metabolism, apoptosis, protein post-translational modification,
structural proteins, and heat shock proteins (Table IP).
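The per-spot calculation behind these coefficients (mean replacement of missing values, the log(1 + x) transform, and a Spearman coefficient across samples) can be illustrated with a short Python sketch. It is not the authors' code: the function names and the five-sample values are invented for the example, and ties are handled with average ranks as an assumption.

```python
import math


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def log1p_transform(values):
    """Apply x -> log(1 + x) to every value."""
    return [math.log(1.0 + v) for v in values]


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return out


def spearman(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Invented values for one protein spot and its mRNA across five samples.
protein_raw = [1.0, None, 3.0, 4.0, 10.0]
mrna = [10.0, 20.0, 25.0, 40.0, 80.0]

protein = log1p_transform(impute_mean(protein_raw))
r = spearman(protein, mrna)
print(round(r, 4))
```

Because the Spearman coefficient depends only on ranks, the monotonic log(1 + x) transform leaves it unchanged; the transform matters for the normalized protein values reported elsewhere rather than for r itself.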

Individual Isoforms of the Same Protein Have Different
Protein/mRNA Correlation Coefficients - Of the 165 protein
spots, 96 represent protein products of 29 genes with at least
two isoforms. Among these 96 protein spots, 19 (19/96 protein
spots, 20%) showed a statistically significant correlation
between their protein and mRNA expression (r > 0.2445; p <
0.05) (Table II) and represented 12 genes (12/29, 41%). Individual
isoforms of the same protein demonstrated different
protein/mRNA correlation coefficients. For example, 2D-PAGE/
Western analysis revealed four isoforms of OP18 differing in
regard to isoelectric point but similar in molecular weight.
Three of the four isoforms (spots 1492, 1493, and 1494) showed
a statistically significant correlation between their protein and
mRNA abundance (r = 0.3234, 0.3154, and 0.4003, respectively).
The fourth isoform (spot 1488) showed no correlation between
protein and mRNA expression (r = 0.0495). Similarly, just
one of five quantified isoforms of cytokeratin 8 (spot 439) demonstrated
a statistically significant correlation between protein
and mRNA abundance (r = 0.3049; p < 0.05) (Table II).
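The proportions quoted in these two paragraphs can be checked with simple arithmetic. The short Python sketch below uses only counts stated in the text (taking 96 as the number of multi-isoform spots, as given at the start of the paragraph); the dictionary labels and the helper name are invented for the example.

```python
def percent(numerator, denominator, digits=1):
    """Percentage rounded to the given number of decimal places."""
    return round(100.0 * numerator / denominator, digits)


# (significant, total) counts as stated in the text.
counts = {
    "all protein spots": (28, 165),
    "all genes": (21, 98),
    "single-spot genes": (9, 69),
    "multi-isoform spots": (19, 96),
    "multi-isoform genes": (12, 29),
}

for label, (significant, total) in counts.items():
    print(f"{label}: {significant}/{total} = {percent(significant, total)}%")

# The subsets add up to the totals reported for the whole data set.
spots_total = counts["single-spot genes"][1] + counts["multi-isoform spots"][1]
genes_total = counts["single-spot genes"][1] + counts["multi-isoform genes"][1]
significant_spots = counts["single-spot genes"][0] + counts["multi-isoform spots"][0]
significant_genes = counts["single-spot genes"][0] + counts["multi-isoform genes"][0]
print(spots_total, genes_total, significant_spots, significant_genes)
```

The rounded values agree with the 17%, 21.4%, 13%, 20%, and 41% figures given in the text, and the single-spot and multi-isoform groups sum to 165 spots, 98 genes, 28 significant spots, and 21 significant genes.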

In addition to differences in the relationship between mRNA
levels and protein expression among separate isoforms, some
genes with very comparable mRNA levels showed a 24-fold
difference in their protein expression. Genes with comparable
protein expression levels also showed up to a 28-fold variance
in their mRNA levels.

Lack of Correlation for mRNA and Protein Expression when
Using Average Tumor Values across All 165 Protein Spots (98
Genes) - The relationship between mRNA and protein expression
was also examined by using the average expression
values for all samples. To analyze this relationship using this
approach, the average value for each protein or mRNA was
generated using all 85 lung tissue samples. The
normalized average protein values ranged from -0.0546 to
0.0973 (raw value 0.0036 to 4.15)47), and the range for mRNA
was from 0 to 15260.5 for all 165 individual protein spots. The
Spearman correlation coefficient for the whole data set (165
protein spots/98 genes) was -0.025 (Fig. 3A). Even for the 28
protein spots (Fig. 2D) that were found to have a statistically
significant correlation between their mRNA and protein, use of

Fig. 2. A-C, plots showing the correlation between mRNA and protein for the three selected genes Op18, Annexin IV, and GAPD for all 76
lung adenocarcinomas and nine non-neoplastic lung samples (p < 0.05). D, distribution of all 165 Spearman correlation coefficients (r) and
verification analysis using SAM. A more detailed description of the method is provided under "Experimental Procedures." Approximately 17%
of the 165 proteins demonstrate a significant correlation between mRNA and protein levels as demonstrated by the values shown beyond the
outer range of threshold Δ = 0.115. Normalized protein values were used, thus negative values for some proteins are observed.

the average value resulted in a correlation coefficient value of
-0.035, which was not significant (Fig. 3B).

Lack of a Relationship between Protein/mRNA Correlation
Coefficients and Average Protein Abundance - To determine
whether an absolute protein level might influence the correlation
with mRNA, the mean value of each protein (relative
abundance) and the Spearman protein/mRNA correlation coefficients
among all 85 samples were examined. No relationship
between the protein abundance and the correlation coefficients
was observed (r = 0.039; p > 0.05). A detailed
analysis of separate subsets of proteins with differing levels of
abundance (less than -0.0014; larger than -0.0014, or larger
than 0.0077) also showed a lack of correlation between mRNA
and protein expression among the 83 (50%), 82 (50%), and 41
(25%) of 165 total protein spots, respectively (r = 0.016, 0.08,
and 0.172, respectively).

Stage-related Changes in the Protein/mRNA Correlation 
Coefficients—To determine whether the 21 genes (28 protein 
spots) showing a significant correlation between the protein 
and mRNA expression among all samples demonstrate 
changes in this relationship during tumor progression, the 
correlations were examined separately for stage I (n = 57) and 
stage III (n = 19) lung adenocarcinomas (Table III). The number 
of nonneoplastic lung samples (n = 9) was insufficient for 
a separate correlation analysis of this group. Many of the 
protein spots represent one of several known protein isoforms 
for a given gene. The majority of genes (16/21) did not differ in 
the protein/mRNA correlation between stage I and stage III 
tumors, indicating a similar regulatory relationship between the 
mRNA and protein spot. GRP-58, PSMC, SOD1, TPI1, and 
VIM, however, were found to demonstrate significant differences 
in the correlation coefficients between stage I and 
stage III lung adenocarcinomas. For GRP-58, PSMC, and VIM 
the change in the correlation coefficient was because of a 
relative increase in protein expression in stage III tumors. For 
SOD and TPI the change resulted from a relative decrease in 
expression of this specific protein in stage III tumors. 

DISCUSSION 

Relatively little is known about the regulatory mechanisms 
controlling the complex patterns of protein abundance and 
post-translational modification in tumors. Most reports concerning 
the regulation of protein translation have focused on 
TABLE III 

Stage-dependent analysis of protein/mRNA correlation coefficients 

r, correlation coefficient. Values in boldface indicate a significant difference between stage I and stage III. 

Spot   Gene name   r (Stage I)   r (Stage III)   Function 
1874   AKR1B1       0.269         0.106          Carbohydrate metabolism; electron transporter 
2524   ANXA1        0.184         0.572          Phospholipase inhibitor; signal transduction 
0994   ANXA4        0.660         0.362          Phospholipase inhibitor 
0963   ANXA5        0.241         0.390          Phospholipase inhibitor; calcium binding; phospholipid binding 
1314   DJ-1         0.363         0.354          Signal transduction 
1405   m.           0.126         0.358          Iron storage protein 
0855   GAPD         0.243         0.581          Carbohydrate metabolism (glycolysis regulation) 
0350   GRP58        0.327        -0.087          Signal transduction; protein disulfide isomerase 
0254   HNRPK        0.360         0.243          RNA-binding protein (RNA processing/modification) 
1192   HSP83        0.457         0.633          Heat shock protein 
0523   KRT18        0.115         0.371          Structural protein 
0439   KRT8         0.323         0.436          Structural protein 
1492   LAP18        0.433         0.663          Signal transduction; cell growth and maintenance 
1638   LGALS1       0.200         0.528          Apoptosis; cell adhesion; cell size control 
1252   PSMC         0.253         0.060          Protein degradation 
1104   SFN          0.465         0.475          Signal transduction (protein kinase C inhibitor) 
1454   SOD1         0.352         0.079          Oxidoreductase 
1203   TPI1         0.378         0.009          Carbohydrate metabolism 
0957   TPM1         0.475         0.226          Structural protein (muscle); control of heart 
0593   VIM         -0.054         0.556          Structural protein 
0935   YWHAH        0.263         0.210          Signal transduction 

one or several protein products (18). Celis et al. (19) found a 
good correlation between transcript and protein levels among 
40 well resolved, abundant proteins using a proteomic and 
microarray study of bladder cancer. By comparing the mRNA 
and protein expression levels within the same tumor samples, 
we found that 17% (28/165) of the protein spots (21/98 genes) 
show a statistically significant correlation between mRNA and 
protein. These proteins appear to represent a diverse group of 
gene products and include those involved in signal transduction, 
carbohydrate metabolism, protein modification, cell structure, 
heat shock, and apoptosis. These results suggest that 
expression of this subset of the 165 proteins is likely to be regulated 
at the transcriptional level in these tissues. The majority of the 
protein isoforms, however, did not correlate with mRNA levels, 
and thus their expression is regulated by other mechanisms. We 
also observed a subset of proteins that demonstrated a negative 
correlation with the mRNA expression values; for example, 
α-haptoglobin demonstrated a strong negative correlation with 
its mRNA expression values. This may reflect negative feedback 
on the mRNA or the protein or the presence of other regulatory 
influences that are not understood currently. 

Post-translational modification or processing will result in 
individual protein products of the same gene migrating to 
different locations on 2D-PAGE gels (20). Because the identity 
of all possible isoforms for each protein examined has not 
been characterized completely, this may influence the correlation 
analyses performed in this study. This is partly because 
of limitations of the 2D-PAGE and mass spectrometry technologies 
(21, 22). Potential inconsistencies between mRNA 
and protein correlations that have been reported may also be 
because of differences, even in the same gene, in the mechanisms 
of protein translation among different cells or as 
measured in different laboratories (23). 

In this study, we examined 165 protein spots identified in 
lung adenocarcinomas. Ninety-six protein spots, representing 
the products of 29 genes, contained at least two protein 
isoforms. Nineteen of 96 protein spots, representing 12 
genes, were shown to have a statistically significant correlation 
between their protein and mRNA expression, suggesting 
that the levels of these proteins reflect the transcription of the 
corresponding genes. Differences in protein/mRNA correlations 
were found among the individual isoforms of a given protein. For 
example, of the four OP18 isoforms, three showed a statistically 
significant correlation between the protein and mRNA expression 
levels. The lack of relationship for the one isoform, however, 
indicates that individual protein isoforms of the same gene 
product can be regulated differentially. This is not unexpected 
and likely reflects other post-translational mechanisms that can 
influence isoform abundance in tissues and cancer. 

In addition to the analyses of the correlation of mRNA/ 
protein within the same tumor samples, we also tested the 
global relationship between mRNA and the corresponding 
protein abundance across all 165 protein spots in the lung 
samples. A protein and mRNA average value for each gene 
was generated using all 85 lung tissue samples. We observed 
a very wide range of normalized average protein and 
mRNA values. The correlation coefficient generated using this 
average value data set was -0.025, and even for the 28 
protein spots that showed a statistically significant correlation 
between individual mRNA and proteins, the correlation value 
was only -0.035. This suggests that it is not possible to 
predict overall protein expression levels based on average 
Fig. 3. The overall correlation of 
mRNA and protein levels across all 
165 protein spots (A) and across 28 
protein spots that contained individual 
r values larger than 0.244 (B) is 
shown. Each protein or mRNA mean 
value was calculated based on all 76 
lung adenocarcinomas and nine non-neoplastic 
lung samples using quantitative 
2D-PAGE and Affymetrix oligonucleotide 
microarrays. The Spearman 
correlation coefficients for the two data 
sets (A and B) were -0.025 and -0.035, 
respectively, indicating a lack of correlation 
if mean values for mRNA and protein 
for all samples are used. 
mRNA abundance in lung cancer samples. This conclusion is 
also supported by previous results from Anderson and Seilhamer 
(24), who examined 19 genes in human liver cells, and 
by Gygi et al. (25), who examined 106 genes in yeast. Both 
studies found a lack of correlation between mRNA and protein 
expression when average or overall levels were used. 

A good correlation was reported when the 11 most abundant 
proteins were examined in yeast (25), suggesting that the 
level of protein abundance may be a factor that influences 
the correlation between mRNA and protein. In the present 
study, a fairly wide range of mean protein values among 165 
protein spots in lung adenocarcinomas was observed, and 
the correlation coefficients also varied from -0.467 to 0.442. 
A comparison between the mean value of each protein and 
the correlation coefficient generated using all 85 tissue samples 
did not reveal a strong relationship between the overall 
protein abundance and the correlation coefficients (r = 0.039; 
p > 0.05). Detailed analysis of different subsets of protein abundance 
also failed to show a correlation between mRNA and 
protein expression. Thus, in contrast to yeast, a relationship 
between mRNA/protein correlation coefficient and protein 
abundance in human lung adenocarcinomas was not observed. 

The results of this study indicate that the level of protein 
abundance in lung adenocarcinomas is associated with the 
corresponding levels of mRNA in 17% (28 proteins) of the 
total 165 protein spots examined. This was substantially 
higher than the amount predicted to result by chance alone 
(which was 5.1) and suggests that a transcriptional mechanism 
likely underlies the abundance of these proteins in lung 
adenocarcinomas. We also demonstrate that the expression 
of individual isoforms of the same protein may or may not 
correlate with the mRNA, indicating that separate and likely 
post-translational mechanisms account for the regulation of 
isoform abundance. These mechanisms may also account for 
the differences in the correlation coefficients observed between 
stage I and stage III tumors, indicating that specific protein 
isoforms show regulatory changes during tumor progression. 
Further studies in lung adenocarcinomas will examine the relationship 
between the expression of individual protein isoforms 
and specific clinical-pathological features of these tumors, such 
as the presence of angiolymphatic invasion, and nodal or pleural 
surface involvement. The potential to identify specific protein 
isoforms associated with biological behavior in lung adenocarcinomas 
would be of considerable interest and will add to our 
understanding of the regulation of gene products by transcriptional, 
translational, and post-translational mechanisms. 
In this study, we examined yeast proteins by two-dimensional (2D) gel electrophoresis and gathered quantitative 
information from about 1,400 spots. We found that there is an enormous range of protein abundance 
and, for identified spots, a good correlation between protein abundance, mRNA abundance, and codon bias. 
For each molecule of well-translated mRNA, there were about 4,000 molecules of protein. The relative 
abundance of proteins was measured in glucose and ethanol media. Protein turnover was examined and found 
to be insignificant for abundant proteins. Some phosphoproteins were identified. The behavior of proteins in 
differential centrifugation experiments was examined. Such experiments with 2D gels can give a global view of 
the yeast proteome. 



The sequence of the yeast genome has been determined (9). 
More recently, the number of mRNA molecules for each ex- 
pressed gene has been measured (27, 30). The next logical level 
of analysis is that of the expressed set of proteins. We have 
begun to analyze the yeast proteome by using two-dimensional 
(2D) gels. 

2D gel electrophoresis separates proteins according to iso- 
electric point in one dimension and molecular weight in the 
other dimension (21), allowing resolution of thousands of pro- 
teins on a single gel. Although modern imaging and computing 
techniques can extract quantitative data for each of the spots in 
a 2D gel, there are only a few cases in which quantitative data 
have been gathered from 2D gels. 2D gel electrophoresis is 
almost unique in its ability to examine biological responses 
over thousands of proteins simultaneously and should there- 
fore allow us a relatively comprehensive view of cellular me- 
tabolism. 

We and others have worked toward assembling a yeast pro- 
tein database consisting of a collection of identified spots in 2D 
gels and of data on each of these spots under various condi- 
tions (2, 7, 8, 10, 23, 25). These data could then be used in 
analyzing a protein or a metabolic process. Saccharomyces 
cerevisiae is a good organism for this approach since it has a 
well-understood physiology as well as a large number of mu- 
tants, and its genome has been sequenced. Given the sequence 
and the relative lack of introns in S. cerevisiae, it is easy to 
predict the sequence of the primary protein product of most 
genes. This aids tremendously in identifying these proteins on 
2D gels. 

There are three pillars on which such a database rests: (i) 
visualization of many protein spots simultaneously, (ii) quan- 
tification of the protein in each spot, and (iii) identification of 
the gene product for each spot. Our first efforts at visualization 
and identification for S. cerevisiae have been described else- 
where (7, 8). Here we describe quantitative data for these 
proteins under a variety of experimental conditions. 

MATERIALS AND METHODS 
Strains and media. S. cerevisiae W303 (MATa ade2-1 his3-11,15 leu2-3,112 
trp1-1 ura3-1 can1-100) was used (26). -Met YNB (yeast nitrogen base) medium 
was 1.7 g of YNB (Difco) per liter, 5 g of ammonium sulfate per liter, and 
adenine, uracil, and all amino acids except methionine; -Met -Cys YNB medium 
was the same but without methionine or cysteine. Medium was supplemented 
with 2% glucose (for most experiments) or with 2% ethanol (for ethanol 
experiments). Low-phosphate YEPD was described by Warner (28). 

Isotopic labeling of yeast and preparation of cell extracts. Yeast strains were 
labeled and proteins were extracted as described by Garrels et al. (7, 8). Briefly, 
cells were grown to 5 x 10' 1 cells per ml at 30°C; 1 ml of culture was transferred 
to a fresh tube, and 0.3 mCi of [35S]methionine (e.g., Express protein labeling 
mix; New England Nuclear) was added to this 1-ml culture. The cells were 
incubated for a further 10 to 15 min and then transferred to a 1.5-ml microcentrifuge 
tube, chilled on ice, and harvested by centrifugation. The supernatant was 
removed, and the cell pellet was resuspended in 100 µl of lysis buffer (20 mM 
Tris-HCl [pH 7.6], 10 mM NaF, 10 mM sodium pyrophosphate, 0.5 mM EDTA, 
0.1% deoxycholate; just before use, phenylmethylsulfonyl fluoride was added to 
1 mM, leupeptin was added to 1 µg/ml, pepstatin was added to 1 µg/ml, tosylsulfonyl 
phenylalanyl chloromethyl ketone was added to 10 µg/ml, and soybean 
trypsin inhibitor was added to 10 µg/ml). 

The resuspended cells were transferred to a screw-cap 1.5-ml polypropylene 
tube containing 0.28 g of glass beads (0.5-mm diameter; Biospec Products) or 
0.40 g of zirconia beads (0.5-mm diameter; Biospec Products). After the cap was 
secured, the tube was inserted into a MiniBeadbeater S (Biospec Products) and 
shaken at medium high speed at 4°C for 1 min. Breakage was typically 75%. 
Tubes were then spun in a microcentrifuge for 10 s at 5,000 x g at 4°C. 

With a very fine pipette tip, liquid was withdrawn from the beads and transferred 
to a prechilled 1.5-ml tube containing 7 µl of DNase I (0.5 mg/ml; Cooper 
product no. 6330)-RNase A (0.25 mg/ml; Cooper product no. 5679)-Mg (50 mM 
MgCl2) mix. Typically 70 µl of liquid was recovered. The mixture was incubated 
on ice for 10 min to allow the RNase and DNase to work. 

Next, 75 µl of 2x dSDS (2x dSDS is 0.6% sodium dodecyl sulfate [SDS], 2% 
mercaptoethanol, and 0.1 M Tris-HCl [pH 8]) was added. The tube was plunged 
into boiling water, incubated for 1 min, and then plunged into ice. After cooling, 
the tube was centrifuged at 4°C for 3 min at 14,000 x g. The supernatant was 
transferred to a fresh tube and frozen at -70°C. About 5 µl of this supernatant 
was used for each 2D gel. 

2D polyacrylamide gels. 2D gels were made and run as described elsewhere 
(6-8). 

Image analysis of the gels. The Quest II software system was used for quan- 
titative image analysis (20, 22). Two techniques were used to collect quantitative 
data for analysis by Quest II software. First, before the advent of phosphorim- 
agers, gels were dried and fluorographed. Each gel was exposed to film for three 
different times (typically 1 day, 2 weeks, and 6 weeks) to increase the dynamic 
range of the data. The films were scanned along with calibration strips to relate 
film optical density to disintegrations per minute in the gels and analyzed by the 
software to obtain a linear relationship between disintegrations per minute in the 
spots and optical densities of the film images. The quantitative data are ex- 
pressed as parts per million of the total cellular protein. This value is calculated 
from the disintegrations per minute of the sample loaded onto the gel and by 
comparing the film density of each data spot with density of the film over the 
calibration strips of known radioactivity exposed to the same film. This yields the 
disintegrations per minute per millimeter for each spot on the gel and thence its 
parts-per-million value. 

After the advent of phosphor imaging, gels bearing 35S-labeled proteins were 
exposed to phosphorimager screens and scanned by a Fuji phosphorimager, 
typically for two exposures per gel. Calibration strips of known radioactivity were 
exposed simultaneously. Scan data from the phosphorimager were assimilated by 
Quest II software, and quantitative data were recorded for the spots on the gels. 
Measurements of protein turnover. Cells in exponential phase were pulse-labeled 
with [35S]methionine, excess cold Met and Cys were added, and samples 
of equal volume were taken from the culture at intervals up to 90 min (in one 
experiment) or up to 160 min (in a second experiment). Incorporation of 35S into 
protein was essentially 100% by the first sample (10 min). Extracts were made, 
and equal fractions of the samples were loaded on 2D gels (i.e., the different 
samples had different amounts of protein but equal amounts of 35S). Spots were 
quantitated with a phosphorimager and Quest software. 

The software was queried for spots whose radioactivity decreased through the 
time course. The algorithm examined all data points for all spots, drew a best-fit 
line through the data points, and looked for spots where this line had a statis- 
tically significant negative slope. In one of the experiments, there was one such 
spot. To the eye, this was a minor, unidentified spot seen only in the first two 
samples (10 and 20 min). In the other experiment, the Quest software found no 
spots meeting the criteria. Therefore, we concluded that none of the identified 
spots (and at most one of the visible spots) represented proteins with short 
half-lives. 
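
The screen for decaying spots described above can be sketched as follows. This is a minimal illustration, not the Quest software's algorithm: it fits an ordinary least-squares line to one spot's time course and flags the spot when the slope is negative and its t statistic falls below a caller-supplied critical value. The function names, the critical value, and the toy time courses are assumptions made here for illustration.

```python
import math


def slope_and_t(times, counts):
    """Least-squares slope of counts versus time and the slope's t statistic."""
    n = len(times)
    if n < 3:
        raise ValueError("need at least three time points")
    mean_t = sum(times) / n
    mean_c = sum(counts) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, counts))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_t
    # Residual sum of squares around the fitted line.
    sse = sum((c - (intercept + slope * t)) ** 2 for t, c in zip(times, counts))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    if std_err == 0.0:
        # Perfect fit: any nonzero slope is treated as infinitely significant.
        t_stat = 0.0 if slope == 0.0 else math.copysign(math.inf, slope)
    else:
        t_stat = slope / std_err
    return slope, t_stat


def is_decaying(times, counts, t_crit):
    """True if the best-fit line has a significantly negative slope."""
    slope, t_stat = slope_and_t(times, counts)
    return slope < 0 and t_stat < -t_crit


# Assumed critical value: one-sided 5% level for 3 degrees of freedom
# (five time points minus two fitted parameters).
T_CRIT = 2.353

chase_min = [10, 20, 30, 60, 90]
stable_spot = [100, 101, 99, 100, 100]   # hypothetical counts
unstable_spot = [100, 80, 62, 30, 8]     # hypothetical counts
print(is_decaying(chase_min, stable_spot, T_CRIT))
print(is_decaying(chase_min, unstable_spot, T_CRIT))
```

A spot whose counts stay flat over the chase gives a slope near zero and is not flagged, which corresponds to the stable proteins reported in the text.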

Centrifugal fractionation. Cells were labeled, harvested, and broken with glass 
beads by the standard method described above except that no detergent (i.e., no 
deoxycholate) was present in the lysis buffer. The crude lysate was cleared of 
unbroken cells and large debris by centrifugation at 300 x g for 30 s. The 
supernatant of this centrifugation was then spun at 16,000 x g for 10 min to give 
the pellet used for Fig. 6B. The supernatant of the 16,000 x g, 10-min spin was 
then spun at 100,000 x g for 30 min to give the supernatant used for Fig. 6A. 

Protein abundance calculations. A haploid yeast cell contains about 4 x 10^-12 
g of protein (1, 15). Assuming a mean protein mass of 50 kDa, there are about 
50 x 10^6 molecules of protein per cell. There are about 1.8 methionines per 10 
kDa of protein mass, which implies 4.5 x 10^8 molecules of methionine per cell 
(neglecting the small pool of free Met). We measured (i) the counts per minute 
in each spot on the 2D gels, (ii) the total number of counts on each gel (by 
integrating counts over the entire gel), and (iii) the total number of counts 
loaded on the gel (by scintillation counting of the original sample). Thus, we 
know what fraction of the total incorporated radioactivity is present in each spot. 
After correcting for the methionine (and cysteine [see below]) content of each 
protein, we calculated an absolute number of protein molecules based on the 
fraction of radioactivity in each spot and on 50 x 10^6 total molecules per cell. 

The labeling mixture used contained about one-fifth as much radioactive 
cysteine as radioactive methionine. Therefore, the number of cysteine molecules 
per protein was also taken into account in calculating the number of molecules 
of protein, but Cys molecules were weighted one-fifth as heavily as Met mole- 
cules. 
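
The arithmetic in the two paragraphs above can be checked with a short sketch. The cell-level figures come from the text; the per-spot function is an assumed reading of the correction for Met and Cys content, and its names and the `mean_label_units` parameter are hypothetical, since the text does not give the average cysteine content.

```python
AVOGADRO = 6.022e23

# Figures stated in the text.
protein_per_cell_g = 4e-12      # grams of protein per haploid cell
mean_protein_da = 50_000        # assumed mean protein mass (50 kDa)
met_per_10_kda = 1.8            # methionines per 10 kDa of protein

# About 4.8e7, which the text rounds to 50 x 10^6 molecules per cell.
proteins_per_cell = protein_per_cell_g / mean_protein_da * AVOGADRO

# 1.8 Met per 10 kDa gives 9 Met in a 50-kDa protein.
met_per_mean_protein = met_per_10_kda * mean_protein_da / 10_000

# 9 Met x 50 x 10^6 proteins = 4.5 x 10^8 methionines per cell.
met_per_cell = met_per_mean_protein * 50e6


def label_units(n_met, n_cys):
    """Label capacity of one molecule; Cys is weighted one-fifth as heavily as Met."""
    return n_met + n_cys / 5.0


def molecules_in_spot(fraction_of_counts, n_met, n_cys,
                      mean_label_units, total_proteins=50e6):
    """Estimate molecules per cell for one spot (illustrative reading of the method).

    fraction_of_counts: the spot's share of total incorporated radioactivity.
    mean_label_units:   average label_units over all proteins (hypothetical input).
    """
    total_units = total_proteins * mean_label_units
    return fraction_of_counts * total_units / label_units(n_met, n_cys)


print(f"{proteins_per_cell:.2e} proteins per cell")
print(f"{met_per_cell:.2e} methionines per cell")
```

Under this reading, a protein with twice the average Met content and the same share of counts is estimated at half as many molecules.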

mRNA abundance calculations. For estimation of mRNA abundance, we used 
SAGE (serial analysis of gene expression) data (27) and Affymetrix chip hybridization 
data (29a, 30). The mRNA column in Table 1 shows mRNA abundance 
calculated from SAGE data alone. However, the SAGE data came from cells 
growing in YEPD medium, whereas our protein measurements were from cells 
growing in YNB medium. In addition, SAGE data for low-abundance mRNAs 
suffer from statistical variation. Therefore, we also used chip hybridization data 
(29a, 30) for mRNA from cells grown in YNB. These hybridization data also had 
disadvantages. First, the amounts of high-abundance mRNAs were systematically 
underestimated, probably because of saturation in the hybridizations, which 
used 10 µg of cRNA. For example, the abundance of ADH1 mRNA was 197 
copies per cell by SAGE but only 32 copies per cell by hybridization, and the 
abundance of ENO2 mRNA was 248 copies per cell by SAGE but only 41 by 
hybridization. When the amount of cRNA used in the hybridization was reduced 
to 1 µg, the apparent amounts of mRNA were similar to the amounts determined 
by SAGE (29a, 29b). However, experiments using 1 µg of cRNA have been done 
for only some genes (29a). Because amounts of mRNA were normalized to 
15,000 per cell, and because the amounts of abundant mRNAs were underestimated, 
there is a 2.2-fold overestimate of the abundance of nonabundant 
mRNAs. We calculated this factor of 2.2 by adding together the number of 
mRNA molecules from a large number of genes expressed at a low level for both 
SAGE data and hybridization data. The sum for the same genes from hybridization 
data is 2.2-fold greater than that from SAGE data. 

To take into account these difficulties, we compiled a list of "adjusted" mRNA 
abundance as follows. For all high-abundance mRNAs of our identified proteins, 
we used SAGE data. For all of these particular mRNAs, chip hybridization 
suggested that mRNA abundance was the same in YEPD and YNB media. For 
medium-abundance mRNAs, SAGE data were used, but when hybridization 
data showed a significant difference between YEPD and YNB, then the SAGE 
data were adjusted by the appropriate factor. Finally, for low-abundance 
mRNAs, we used data from chip hybridizations from YNB medium but divided 
by 2.2 to normalize to the SAGE results. These calculations were completed 
without reference to protein abundance. 
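
The three rules for "adjusted" mRNA abundance can be summarized in a short sketch. The text does not state the cutoffs separating high, medium, and low abundance, how a significant media difference was judged, or the exact form of the "appropriate factor", so the caller supplies the class and the media flag, and the YNB/YEPD hybridization ratio used for the medium class is an assumption. The function and parameter names are hypothetical.

```python
LOW_ABUNDANCE_CORRECTION = 2.2  # factor stated in the text


def adjusted_mrna(abundance_class, sage, chip_ynb=None, chip_yepd=None,
                  media_differ=False):
    """Return an adjusted mRNA copy number per cell (illustrative only).

    abundance_class: "high", "medium", or "low" (cutoffs are not given in the text).
    sage:            copies per cell from SAGE (cells grown in YEPD).
    chip_ynb/yepd:   copies per cell from chip hybridization in each medium.
    media_differ:    whether hybridization shows a significant YEPD/YNB difference.
    """
    if abundance_class == "high":
        # High-abundance mRNAs: SAGE data are used directly.
        return sage
    if abundance_class == "medium":
        if media_differ:
            # Assumed form of the "appropriate factor": the YNB/YEPD chip ratio.
            return sage * chip_ynb / chip_yepd
        return sage
    if abundance_class == "low":
        # Low-abundance mRNAs: YNB chip data, normalized to the SAGE scale.
        return chip_ynb / LOW_ABUNDANCE_CORRECTION
    raise ValueError(f"unknown abundance class: {abundance_class!r}")


# ADH1 is the text's example of a high-abundance mRNA (197 copies by SAGE).
print(adjusted_mrna("high", sage=197.0))
# Hypothetical low-abundance gene with 4.4 copies per cell by hybridization.
print(adjusted_mrna("low", sage=1.0, chip_ynb=4.4))
```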

CAI. The codon adaptation index (CAI) was taken from the yeast proteome 
database (YPD) (13), for which calculations were made according to Sharp and 
Li (24). Briefly, the index uses a reference set of highly expressed genes to assign 
a value to each codon, and then a score for a gene is calculated from the 
frequency of use of the various codons in that gene (24). 
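
As a rough illustration of the index, the sketch below assigns each codon a relative adaptiveness (its count in the reference set divided by the count of the most-used synonymous codon) and scores a gene as the geometric mean of those values over its codons. The two-amino-acid codon table and the reference counts are toy assumptions, not yeast data, and codons with no usable weight are simply skipped, which is a simplification made here.

```python
import math


def relative_adaptiveness(ref_counts, synonymous):
    """Map each codon to count / (count of the most-used synonymous codon)."""
    weights = {}
    for codons in synonymous.values():
        top = max(ref_counts.get(c, 0) for c in codons)
        for c in codons:
            weights[c] = ref_counts.get(c, 0) / top if top else 0.0
    return weights


def cai(gene_codons, weights):
    """Geometric mean of codon weights over the codons of one gene."""
    # Codons absent from the table (e.g. ATG, which has no synonym) or with
    # zero weight are skipped in this simplified sketch.
    logs = [math.log(weights[c]) for c in gene_codons
            if weights.get(c, 0.0) > 0.0]
    if not logs:
        raise ValueError("no scorable codons in gene")
    return math.exp(sum(logs) / len(logs))


# Toy synonymous-codon table and hypothetical reference-set counts.
synonymous = {"Lys": ["AAA", "AAG"], "Glu": ["GAA", "GAG"]}
ref_counts = {"AAA": 20, "AAG": 80, "GAA": 90, "GAG": 10}

weights = relative_adaptiveness(ref_counts, synonymous)
print(cai(["ATG", "AAG", "GAA"], weights))  # only preferred codons
print(cai(["ATG", "AAA", "GAA"], weights))  # one rarely used codon
```

A gene built only from the preferred codons scores at the top of the scale, and each rarely used codon pulls the geometric mean down.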

Statistical analysis. The JMP program was used with the aid of T. Fully. The 
JMP program showed that neither mRNA nor protein abundances were normally 
distributed; therefore, Spearman rank correlation coefficients (r_s) were 
calculated. The mRNA (adjusted and unadjusted) and protein data were also 
transformed so that Pearson product-moment correlation coefficients (r_p) could 
be calculated. First, this was done by a Box-Cox transformation of log-transformed 
data. This transformation produced normal distributions, and an r_p of 
0.76 was achieved. However, because the Box-Cox transformation is complex, we 
also did a simpler logarithmic transformation. This produced a normal distribution 
for the protein data. However, the distribution for the mRNA and adjusted 
mRNA data was close to, but not quite, normal. Nevertheless, we calculated the 
r_p and found that it was 0.76, identical to the coefficient from the Box-Cox 
transformed data. We therefore believe that this correlation coefficient is not 
misleading, despite the fact that the log(mRNA) distribution is not quite normal. 
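
The simpler of the two transformations can be sketched as a Pearson correlation computed on log-transformed abundances. This is an illustration only: the analysis in the text was done in JMP, the Box-Cox step is not reproduced here, and the function names and toy values are assumptions.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def log_pearson(mrna, protein):
    """Pearson correlation after a log10 transform; all values must be positive."""
    log_mrna = [math.log10(v) for v in mrna]
    log_protein = [math.log10(v) for v in protein]
    return pearson(log_mrna, log_protein)


# Hypothetical copies per cell spanning several orders of magnitude.
mrna = [1.0, 3.0, 12.0, 40.0, 200.0]
protein = [6000.0, 9000.0, 70000.0, 120000.0, 900000.0]
print(round(pearson(mrna, protein), 3))      # raw values
print(round(log_pearson(mrna, protein), 3))  # log-transformed values
```

On the raw scale a few highly abundant genes dominate the coefficient; the log transform spreads the values so that each order of magnitude contributes comparably.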



RESULTS 

Visualization of 1,400 spots on three gel systems. Yeast 
proteins have isoelectric points ranging from 3.1 to 12.8, and 
masses ranging from less than 10 kDa to 470 kDa. It is difficult 
to examine all proteins on a single kind of gel, because a gel 
with the needed range in pI and mass would give poor resolution 
of the thousands of spots in the central region of the gel. 
Therefore, we have used three gel systems: (i) pH "4 to 8" with 
10% polyacrylamide; (ii) pH "3 to 10" with 10% polyacrylamide; 
and (iii) nonequilibrium with 15% polyacrylamide (7, 
8). Each gel system allows good resolution of a subset of yeast 
proteins. 

Figure 1 shows a pH 4-8, 10% polyacrylamide gel. The pH 
at the basic end of the isoelectric focusing gel cannot be maintained 
throughout focusing, and so the proteins resolved on 
such gels have isoelectric points between pH 4 and pH 6.7. For 
these pH 4-8 gels, we see 600 to 900 spots on the best gels after 
multiple exposures. 

The pH 3-10 gels (not shown) extend the pI range somewhat 
beyond pH 7.5, allowing detection of several hundred additional 
spots. Finally, we use nonequilibrium gels with 15% 
acrylamide in the second dimension. These allow visualization 
of about 100 very basic proteins and about 170 small proteins 
(less than 20 kDa). In total, using all three gel systems, about 
1,400 spots can be seen. These represent about 1,200 different 
proteins, which is about one-quarter to one-third of the proteins 
expressed under these conditions (27, 30). Here, we focus 
on the proteins seen on the pH 4-8 gels. 

Although nearly all expressed proteins are present on these 
gels, the number seen is limited by a problem we call coverage. 
Since there are thousands of proteins on each gel, many proteins 
comigrate or nearly comigrate. When two proteins are 
resolved, but are close together, and one protein spot is much 
more intense than the other, a problem arises in visualizing the 
weaker spot: at long exposures when the weak signal is strong 
enough for detection, the signal from the strong spot spreads 
and covers the signal from the weaker spot. Thus, weak spots 
can be seen only when they are well separated from strong 
spots. 

For a given gel, the number of detectable spots initially rises 
with exposure time. However, beyond an optimal exposure, the 
number of distinguishable spots begins to decrease, because 
signals from strong spots cover signals from nearby weak spots. 
At long exposures, the whole autoradiogram turns black. Thus, 
there is an optimum exposure yielding the maximum number 
of spots, and at this exposure the weakest spots are not seen. 

Largely because of the problem of coverage, the proteins 
seen are strongly biased toward abundant proteins. All identi- 
fied proteins have a CAI of 0.18 or more, and we have iden- 
tified no transcription factors or protein kinases, which are 
nonabundant proteins. Thus, this technology is useful for ex- 
amining protein synthesis, amino acid metabolism, and glyco- 
lysis but not for examining transcription, DNA replication, or 
the cell cycle. 
FIG. 1. 2D gels. The horizontal axis is the isoelectric focusing dimension, which stretches from pH 6.7 (left) to pH 4.3 (right). The vertical axis is the polyacrylamide 
gel dimension, which stretches from about 15 kDa (bottom) to at least 130 kDa (top). For panel A, extract was made from cells in log phase in glucose; for panel B, 
cells were grown in ethanol. The spots labeled 1 through 6 are unidentified proteins highly induced in ethanol. 
Spot identification. The identification of various spots has 
been described elsewhere (7, 8). At present, 169 different spots 
representing 148 proteins have been identified. Many of these 
spots have been independently identified (2, 10, 23, 25). The 
main methods used in spot identification have been analysis of 
amino acid composition, gene overexpression, peptide se- 
quencing, and mass spectrometry. 

Pulse-chase experiments and protein turnover. Pulse-chase 
experiments were done to measure protein half-lives (Materials 
and Methods). Cells were labeled with [35S]methionine for 
10 min, and then an excess of unlabeled methionine was added. 
Samples were taken at 0, 10, 20, 30, 60, and 90 min after the 
beginning of the chase. Equal amounts of 35S were loaded from 
each sample; 2D gels were run, and spots were quantitated. 
Surprisingly, almost every spot was nearly constant in amount 
of radioactivity over the entire time course (not shown). A few 
spots shifted from one position to another because of post-translational 
modifications (e.g., phosphorylation of Rpa0 and 
Efb1). Thus, the proteins being visualized are all or nearly all 
very stable proteins, with half-lives of more than 90 min. Gygi 
et al. (10) have come to a similar conclusion by using the N-end 
rule to predict protein half-lives. This result does not imply 
that all yeast proteins are stable. The proteins being visualized 
are abundant proteins; this is partly because they are stable 
proteins. 

Protein quantitation. Because all of the proteins seen had 
effectively the same half-life, the abundance of each protein 
was directly proportional to the amount of radioactivity incor- 
porated during labeling. Thus, after taking into account the 
total number of protein molecules per cell, the average content 
of methionine and cysteine, and the methionine and cysteine 
content of each identified protein, we could calculate the abun- 
dance of each identified protein (Tables 1 and 2; Materials and 
Methods). About 1,000 unidentified proteins were also quan- 
tified, assuming an average content of Met and Cys. 

Many proteins give multiple spots (7, 8). The contribution 
from each spot was summed to give the total protein amount. 
However, many proteins probably have minor spots that we are 
not aware of, causing the amount of protein to be underesti- 
mated. 

When the proteins on a pH 4-8 gel were ordered by abun- 
dance, the most abundant protein had 8,904 ppm, the 10th 
most abundant had 2,842 ppm, the 100th most abundant had 
314 ppm, the 500th most abundant had 57 ppm, and the 
1,000th most abundant (visualized at greater than optimum 
exposure) had 23 ppm. Thus, there is more than a 300-fold 
range in abundance among the visualized proteins. The most 
abundant 10 proteins account for about 25% of the total protein
on the pH 4-8 gel, the most abundant 60 proteins account
for 50%, and the most abundant 500 proteins account for 80%.
Since it seems likely that the pH 4-8 gels give a representative
sampling of all proteins, we estimate that half of the total
cellular protein is accounted for by fewer than 100 different 
gene products, principally glycolytic enzymes and proteins in- 
volved in protein synthesis. 

Correlation of protein abundance with mRNA abundance. 
Estimates of mRNA abundance for each gene have been made 
by SAGE (27) and by hybridization of cRNA to oligonucleo- 
tide arrays (30). These two methods give broadly similar re- 
sults, yet each method has strengths and weaknesses (Materials 
and Methods). Table 1 lists the number of molecules of mRNA 
per cell for each gene studied. One measurement (mRNA) 
uses data from SAGE analysis alone (27); a second incorpo- 
rates data from both SAGE and hybridization (30) (adjusted 
mRNA) (Table 1; Materials and Methods). We correlated
protein abundance with mRNA abundance (Fig. 2). For adjusted
mRNA versus protein, the Spearman rank correlation
coefficient, r_s, was 0.74 (P < 0.0001), and the Pearson correlation
coefficient, r_p, on log-transformed data (Materials and
Methods) was 0.76 (P < 0.00001). We obtained similar correlations
for mRNA versus protein and also for other data transformations
(Materials and Methods). Thus, several statistical
methods show a strong and significant correlation between 
mRNA abundance and protein abundance. Of course, the cor- 
relation is far from perfect; for mRNAs of a given abundance, 
there is at least a 10-fold range of protein abundance (Fig. 2). 
Some of this scatter is probably due to posttranscriptional 
regulation, and some is due to errors in the mRNA or protein 
data. For example, the protein Yef3 runs poorly on our gels, 
giving multiple smeared spots. Its abundance has probably 
been underestimated, partly explaining the low protein/mRNA 
ratio of Yef3. It is the most extreme outlier in Fig. 2. 
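
The two statistics used above can be sketched in plain Python. This is an illustrative implementation, not the authors' analysis code: the function names are hypothetical, and the mRNA and protein values are invented toy numbers chosen only to show the mechanics.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def pearson_log(x, y):
    """Pearson correlation computed on log10-transformed values."""
    return pearson([math.log10(v) for v in x], [math.log10(v) for v in y])


# Invented toy values (molecules per cell), for illustration only.
mrna = [1, 2, 5, 10, 50]
protein = [3000, 12000, 9000, 60000, 150000]
r_s = spearman(mrna, protein)
r_p_log = pearson_log(mrna, protein)
```

The rank-based statistic depends only on ordering, so it makes no assumption about how the abundances are distributed; the Pearson statistic is applied after a log transform because raw abundances span several orders of magnitude.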

These data on mRNA (27, 30) and protein abundance (Table 1)
suggest that for each mRNA molecule, there are on
average 4,000 molecules of the cognate protein. For instance,
for Act1 (actin) there are about 54 molecules of mRNA per
cell and about 205,000 molecules of protein. Assuming an
mRNA half-life of 30 min (12) and a cell doubling time of 120
min, this suggests that an individual molecule of mRNA might
be translated roughly 1,000 times. These calculations are limited
to mRNAs for abundant proteins, which are likely to be
the mRNAs that are translated best.

A full complement of cell protein is synthesized in about 120
min under these conditions. Thus, 4,000 molecules of protein
per molecule of mRNA implies that translation initiates on an
mRNA about once every 2 s. This is a remarkably high rate; it
implies that if an average mRNA bears 10 ribosomes engaged
in translation, then each ribosome completes translation in
20 s; if an average protein has 450 residues, this in turn implies
translation of over 20 amino acids per s, a rate considerably
higher than estimated for mammalians (3 to 8 amino acids per
s) (18). These estimates depend on the amount of mRNA per
cell (11, 27).
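
The arithmetic in the two preceding paragraphs can be laid out step by step. The sketch below is one reading of that reasoning, not a calculation taken from the paper: the function and variable names are hypothetical, and treating the mRNA half-life as the lifetime of a molecule is a simplification. Because the intermediate values are not rounded here, they differ slightly from the rounded figures in the text (1.8 s rather than 2 s between initiations, for example).

```python
def translation_estimates(proteins_per_mrna=4000, doubling_time_min=120,
                          mrna_half_life_min=30, ribosomes_per_mrna=10,
                          residues_per_protein=450):
    """Back-of-envelope translation rates from the stated assumptions."""
    # Proteins made per mRNA over one doubling, scaled by the fraction of
    # that time a single mRNA molecule is assumed to survive.
    translations_per_mrna = (
        proteins_per_mrna * mrna_half_life_min / doubling_time_min
    )
    # One doubling time, in seconds, shared among all initiations.
    seconds_between_initiations = doubling_time_min * 60 / proteins_per_mrna
    # With several ribosomes on the message at once, each ribosome has
    # that many initiation intervals to finish one protein.
    seconds_per_protein = ribosomes_per_mrna * seconds_between_initiations
    residues_per_second = residues_per_protein / seconds_per_protein
    return {
        "translations_per_mrna": translations_per_mrna,
        "seconds_between_initiations": seconds_between_initiations,
        "seconds_per_protein": seconds_per_protein,
        "residues_per_second": residues_per_second,
    }


estimates = translation_estimates()
```

All four outputs scale directly with the assumed number of mRNA molecules per cell, which is why the text notes that the estimates depend on that quantity.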

The large number of protein molecules that can be made
from a single mRNA raises the issue of how abundance is
controlled for less abundant proteins. Many nonabundant proteins
may be unstable, and this would reduce the protein/mRNA
ratio. In addition, many nonabundant proteins may be
translated at suboptimal rates. We have found that mRNAs for
nonabundant proteins usually have suboptimal contexts for
translational initiation. For example, there are over 600 yeast
genes which probably have short open reading frames in the
mRNA upstream of the main open reading frame (17a). These
may be devices for reducing the amount of protein made from 
a molecule of mRNA. 

Correlation of codon bias with protein abundance. The 
mRNAs for highly expressed proteins preferentially use some
codons rather than others specifying the same amino acid (14). 
This preference is called codon bias. The codons preferred are 
those for which the tRNAs are present in the greatest amounts. 
Use of these codons may make translation faster or more 
efficient and may decrease misincorporation. These effects are
most important for the cell for abundant proteins, and so 
codon bias is most extreme for abundant proteins. The effect 
can be dramatic — highly biased mRNAs may use only 25 of the 
61 codons. 

We asked whether the correlation of codon bias with abun- 
dance continues for medium-abundance proteins. There are 
various mathematical expressions quantifying codon bias; here, 
we have used the CAI (24) (Materials and Methods) because 
it gives a result between 0 and 1. The r_s for CAI versus protein
abundance is 0.80 (P < 0.0001), similar to the mRNA-protein

[TABLE 1: body not recovered. Row groups: carbohydrate metabolism, protein synthesis, heat shock, amino acid synthesis, miscellaneous.]

(a) CAI, a measure of codon bias, is taken from the YPD. mRNA, number of mRNA molecules per cell from SAGE data (27); adjusted mRNA, number of mRNA
molecules per cell based on both SAGE and chip hybridization (30) (see Materials and Methods); Protein (Glu), number of molecules of protein per cell in
YNB-glucose; Protein (Eth), number of molecules of protein per cell in YNB-ethanol; E/G ratio, ratio of protein abundance in ethanol to glucose. The E/G ratio is
not given if it was close to 1 or if it was not repeatable (NR) in multiple gels. Some gene products (e.g., Tif1 and Tif2 [Tif1,2]) were difficult to distinguish on either
a protein or an mRNA basis; these are pooled. No Nla, there was no suitable NlaIII site in the 3' region of the gene, and so there are no SAGE mRNA data; No Met,
the mature gene product contains no methionines, and so there are no reliable protein data.

TABLE 2. Functions of proteins listed in Table 1



Name" 



YPD title lines* 



Alcohol dehydrogenase I; cytoplasmic isozyme reducing acetaldehyde to ethanol, regenerating NAD+
Alcohol dehydrogenase II; oxidizes ethanol to acetaldehyde, glucose repressed
Citrate synthase, peroxisomal (nonmitochondrial); converts acetyl-CoA and oxaloacetate to citrate plus CoA
Enolase 1 (2-phosphoglycerate dehydratase); converts 2-phospho-D-glycerate to phosphoenolpyruvate in glycolysis
Enolase 2 (2-phosphoglycerate dehydratase); converts 2-phospho-D-glycerate to phosphoenolpyruvate in glycolysis
Fructose bisphosphate aldolase II; sixth step in glycolysis
Hexokinase I; converts hexoses to hexose phosphates in glycolysis; repressed by glucose
Hexokinase II; converts hexoses to hexose phosphates in glycolysis and plays a regulatory role in glucose repression
Isocitrate lyase, peroxisomal; carries out part of the glyoxylate cycle; required for gluconeogenesis
Pyruvate dehydrogenase complex, E1 beta subunit
Pyruvate decarboxylase isozyme 1
Phosphofructokinase alpha subunit, part of a complex with Pfk2p which carries out a key regulatory step in glycolysis
Glucose-6-phosphate isomerase, converts glucose-6-phosphate to fructose-6-phosphate
Pyruvate carboxylase 1; converts pyruvate to oxaloacetate for gluconeogenesis
Transaldolase; component of nonoxidative part of pentose phosphate pathway
Glyceraldehyde-3-phosphate dehydrogenase 2; converts D-glyceraldehyde 3-phosphate to 1,3-diphosphoglycerate
Glyceraldehyde-3-phosphate dehydrogenase 3; converts D-glyceraldehyde 3-phosphate to 1,3-diphosphoglycerate
Triosephosphate isomerase; interconverts glyceraldehyde-3-phosphate and dihydroxyacetone phosphate

Translation elongation factor EF-1β; GDP/GTP exchange factor for Tef1p/Tef2p
Translation elongation factor EF-2; contains diphthamide which is not essential for activity; identical to Eft2p
Translation elongation factor EF-2; contains diphthamide which is not essential for activity; identical to Eft1p
Translation initiation factor eIF3 beta subunit (p90); has an RNA recognition domain
Acidic ribosomal protein A0
Translation initiation factor 4A (eIF4A) of the DEAD box family
Translation initiation factor 4A (eIF4A) of the DEAD box family
Translation elongation factor EF-3A; member of ATP-binding cassette superfamily

Chaperonin homologous to E. coli HtpG and mammalian HSP90
Mitochondrial chaperonin that cooperates with Hsp10p; homolog of E. coli GroEL
Heat-inducible chaperonin homologous to E. coli HtpG and mammalian HSP90
Heat shock protein required for induced thermotolerance and for resolubilizing aggregates of denatured proteins; important for [psi-]-to-[PSI+] prion conversion
Heat shock protein of the endoplasmic reticulum lumen required for protein translocation across the endoplasmic reticulum membrane and for nuclear fusion; member of the HSP70 family
Cytoplasmic chaperone; heat shock protein of the HSP70 family
Cytoplasmic chaperone; member of the HSP70 family
Heat shock protein of HSP70 family involved in the translational apparatus
Heat shock protein of HSP70 family, cytoplasmic
Mitochondrial protein that acts as an import motor with Tim44p and plays a chaperonin role in receiving and folding of protein chains during import; heat shock protein of HSP70 family
Heat shock protein of the HSP70 family; multicopy suppressor of mutants with hyperactivated Ras/cyclic AMP pathway
Stress-induced protein required for optimal growth at high and low temperature; has tetratricopeptide repeats

Phosphoribosylamidoimidazole-succinocarboxamide synthase; catalyzes the seventh step in de novo purine biosynthesis pathway
C1-tetrahydrofolate synthase (trifunctional enzyme), cytoplasmic
Phosphoribosylamine-glycine ligase plus phosphoribosylformylglycinamidine cyclo-ligase; bifunctional protein
Argininosuccinate lyase; catalyzes the final step in arginine biosynthesis
Glutamate dehydrogenase (NADP+); combines ammonia and α-ketoglutarate to form glutamate
Glutamine synthetase; combines ammonia to glutamate in ATP-driven reaction
Phosphoribosyl-AMP cyclohydrolase/phosphoribosyl-ATP pyrophosphohydrolase/histidinol dehydrogenase; 2nd, 3rd, and 10th steps of his biosynthesis pathway
Ketol-acid reductoisomerase (acetohydroxy acid reductoisomerase) (alpha-keto-beta-hydroxylacyl reductoisomerase); second step in Val and Ilv biosynthesis pathway
Saccharopine dehydrogenase (NADP+, L-glutamate forming) (saccharopine reductase), seventh step in lysine biosynthesis pathway
Homocysteine methyltransferase; (5-methyltetrahydropteroyl triglutamate-homocysteine methyltransferase), methionine synthase, cobalamin independent
γ-Glutamyl phosphate reductase (phosphoglutamate dehydrogenase), proline biosynthetic enzyme
Phosphoserine transaminase; involved in synthesis of serine from 3-phosphoglycerate
Tryptophan synthase, last (5th) step in tryptophan biosynthesis pathway

Actin; involved in cell polarization, endocytosis, and other cytoskeletal functions
Adenylate kinase (GTP:AMP phosphotransferase), cytoplasmic
Cytosolic acetaldehyde dehydrogenase
Beta subunit of F1-ATP synthase; 3 copies are found in each F1 oligomer
Homolog of mammalian 14-3-3 protein; has strong similarity to Bmh2p
Homolog of mammalian 14-3-3 protein; has strong similarity to Bmh1p
Protein of the AAA family of ATPases; required for cell division and homotypic membrane fusion
Leucyl-tRNA synthetase, cytoplasmic
Farnesyl pyrophosphate synthetase; may be rate-limiting step in sterol biosynthesis pathway
DL-Glycerol phosphate phosphatase
Ran, a GTP-binding protein of the Ras superfamily involved in trafficking through nuclear pores
Inorganic pyrophosphatase, cytoplasmic
Component of serine C-palmitoyltransferase; first step in biosynthesis of long-chain base component of sphingolipids
Thiamine-repressed protein essential for growth in the absence of thiamine
Poly(A)-binding protein of cytoplasm and nucleus; part of the 3'-end RNA-processing complex (cleavage factor I); has 4 RNA recognition domains
Mannose-1-phosphate guanyltransferase; GDP-mannose pyrophosphorylase
Ribonucleotide reductase small subunit
S-Adenosylmethionine synthetase 1
S-Adenosylmethionine synthetase 2
Copper-zinc superoxide dismutase
Ubiquitin-activating (E1) enzyme

correlation, confirming a strong correlation between CAI and
protein abundance (Fig. 3). The relationship between CAI and
protein abundance is log linear from about 1,000,000 to about
10,000 molecules per cell. We have no data for rarer proteins. 

It is not clear whether CAI reflects maximum or average 
levels of protein expression. The proteins used for the CAI- 
protein correlation included some proteins which were not 
expressed at maximum levels under the condition of the experiment
(Hsc82, Hsp104, Ssa1, Ade1, Arg4, His4, and others).
When these proteins were removed from consideration and
the correlation between CAI and the remaining (presumably
constitutive) proteins was recalculated, the r_s was essentially
unchanged (not shown). 

The equation describing the graph in Fig. 3 is log (protein 
molecules/cell) = (2.3 x CAI) + 3.7. Thus, under certain 
conditions (a CAI of 0.3 or greater; a constitutively expressed 
gene), a very rough estimate of protein abundance can be 
made by raising 10 to the power of [(2.3 x CAI) + 3.7].
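
As a worked illustration, the fitted line can be wrapped in a small Python function. The function name and the range check are hypothetical additions; only the coefficients 2.3 and 3.7 and the lower CAI limit of 0.3 come from the text, and the result is, as stated there, a very rough estimate for constitutively expressed genes.

```python
def estimate_protein_molecules(cai):
    """Rough protein molecules per cell from the Codon Adaptation Index.

    Applies log10(protein molecules/cell) = 2.3 * CAI + 3.7, which the
    text describes as usable only for a CAI of 0.3 or greater.
    """
    if not 0.3 <= cai <= 1.0:
        raise ValueError("the fit is described only for CAI from 0.3 to 1.0")
    return 10 ** (2.3 * cai + 3.7)


# Estimates at the low end and middle of the fitted range.
low_bias = estimate_protein_molecules(0.3)
mid_bias = estimate_protein_molecules(0.5)
```

Because the relationship is log linear, each increase of 0.1 in CAI multiplies the estimate by the same factor, 10 to the power 0.23, or roughly 1.7.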

The distribution of CAI over the genome (Fig. 4) consists of 
a lower, bell-shaped distribution, possibly indicating a region 
where there is no selection for codon bias, and an upper, flat 
distribution, starting at a CAI of about 0.3, possibly indicating 
a region where there is selection for codon bias. Almost all of 
the proteins whose abundance we have measured are in the 
upper, flat portion of the distribution. In the lower, bell-shaped 
region, we do not know whether there is a correlation between 
CAI and protein abundance. 

Changes in protein abundance in glucose and ethanol. A
comparison of cells grown in glucose (Fig. 1A) with cells grown
in ethanol (Fig. 1B) is shown in Table 1. As is well known,
some proteins are induced tremendously during growth on
ethanol. Two striking examples are the peroxisomal enzymes
Icl1 (isocitrate lyase) and Cit2 (citrate synthase), which are
induced in ethanol by more than 100- and 12-fold, respectively
(Fig. 1; Table 1). These enzymes are key components of the
glyoxylate shunt, which diverts some acetyl coenzyme A
(acetyl-CoA) from the tricarboxylic acid cycle to gluconeogenesis.
S. cerevisiae requires large amounts of carbohydrate for its
cell wall; in ethanol medium, this carbohydrate comes from
gluconeogenesis, which depends on the glyoxylate shunt and
on the glycolytic pathway running in reverse. The need for 



gluconeogenesis also explains why glycolytic enzymes are 
abundant even in ethanol medium. Thus, 2D gel analysis shows 
the prominence of the glycolytic and glyoxylate shunt enzymes 
in cells grown on ethanol, emphasizing that gluconeogenesis,
presumably largely for production of the cell wall, is a major 
metabolic activity under these conditions. 

During gluconeogenesis, substrate-product relationships are 
reversed for the glycolytic enzymes. One might expect that not 
all glycolytic enzymes would be well adapted to the reverse 
reaction. Indeed, 2D gels show that in ethanol, Adh2 (alcohol 
dehydrogenase 2) is strongly induced (16), while its isozyme 
Adh1 is not greatly affected. Adh1 and Adh2 each interconvert
acetaldehyde and ethanol. Adh1 has a relatively high K_m for
ethanol (17 mM), while Adh2 has a lower K_m (0.8 mM) (5).
Thus, it is thought that Adh1 is specialized for glycolysis
(acetaldehyde to ethanol), while Adh2 is specialized for respiration
(ethanol to acetaldehyde) (5, 29). Similarly, Eno1 (enolase
1) is induced in ethanol, while its isozyme Eno2 (enolase 2)
decreases in abundance (Table 1) (4, 19). Eno1 is inhibited by
2-phosphoglycerate (the glycolytic substrate), while Eno2 is
inhibited by phosphoenolpyruvate (the gluconeogenic substrate)
(4). Perhaps Eno1 has a lower K_m for phosphoenolpyruvate
than does Eno2, though to our knowledge this has not
been tested. Thus, the 2D gels distinguish isozymes specialized
for growth on glucose (Adh1 and Eno2) from isozymes specialized
for ethanol (Adh2 and Eno1).

Many heat shock proteins (e.g., Hsp60, Hsp82, Hsp104, and
Kar2) were about twofold more abundant in ethanol medium 
than in glucose medium. This is consistent with the increased 
heat resistance of cells grown in ethanol (3). 

Enzymes involved in protein synthesis (Eft1, Rpa0, and Tif1)
were about twice as abundant in glucose medium as in ethanol 
medium. This may reflect the higher growth rate of the cells in 
glucose. 

Phosphorylation of proteins. To examine protein phosphorylation,
we labeled cells with 32P and ran 2D gels to examine
phosphoproteins. About 300 distinct spots, probably representing
150 to 200 proteins, could be seen on pH 4-8 gels (Fig. 5B).
We then aligned autoradiograms of three gels, each with a
different kind of labeled protein (32P only [Fig. 5B], 32P plus
35S [Fig. 5A], and 35S only [not shown, but see Fig. 1 for
example]). In this way, we made provisional identification of




FIG. 3. Correlation of protein abundance with CAI. The number of mole- 
cules per cell of each protein is plotted against the CAI for that protein. Note the 
logarithmic scale on the protein axis. Data for the CAI are from the YPD 
database (13). 



7364 FUTCHER ET AL. 



Mol. Cell. Biol. 




FIG. 4. Distribution of CAI over the whole genome, shown in intervals of 0.030 (i.e., there are 150 genes with a CAI between 0.000 and 0.030, inclusive; 31 genes
with a CAI between 0.031 and 0.060; 269 genes with a CAI between 0.061 and 0.090; 1,296 genes with a CAI between 0.091 and 0.120; etc.). The distribution peaks
with 2,028 genes with a CAI between 0.121 and 0.150.



some of the 32P-labeled spots as particular 35S-labeled spots.
All such identifications are somewhat uncertain, since precise
alignments are difficult, and of course multiple spots may exactly
comigrate. Nevertheless, we believe that most of the
provisional identifications are probably correct. Among the
major 32P-labeled proteins are the hexokinases Hxk1 and
Hxk2, the acidic ribosome-associated protein Rpa0, the translation
factors Yef3 and Efb1, and probably Hsp70 heat shock
proteins of the Ssa and Ssb families. Rpa0 and Efb1 are quantitatively
monophosphorylated.

Many yeast proteins resolve into multiple spots on these 2D 
gels (7). Yef3 has five or more spots, at least four of which
comigrate with 32P. Tpi1 has a major spot showing no 32P
labeling and a minor, more acidic spot which overlaps with
some 32P label. Tif1 has at least seven spots (7); two of these
overlap with some 32P label, but five do not (Fig. 5). Eft1 has
at least three spots (7), and none of these overlap with 32P,
although there are three nearby, unidentified 32P-labeled spots
(a, c, and d in Fig. 5). Spots that seem to be extra forms of
Met6, Pdc1, Eno2, and Fba1 can be seen in Fig. 6A, but there
is little 32P at these positions in Fig. 5. Thus, phosphorylation
explains some but not all of the different protein isoforms seen. 

The cell cycle is regulated in part by phosphorylation. We 
compared 32P-labeled proteins from cells synchronized in G1
with α-factor, in cells synchronized in G1 by depletion of G1
cyclins, and in cells synchronized in M phase with nocodazole.
Only very minor differences were seen, and these were difficult 
to reproduce. The cell cycle proteins regulated by phosphory- 
lation may not be abundant enough for this technique to be 
applied easily. 

Centrifugal fractionation. We fractionated 35S-labeled extracts
by centrifugation (Materials and Methods). Figure 6A
shows the proteins in the supernatant of a high-speed
(100,000 x g, 30 min) centrifugation, while Fig. 6B shows the
proteins in the pellet of a low-speed (16,000 x g, 10 min)
centrifugation. Many proteins are tremendously enriched in
one fraction or the other, while others are present in both. 



Most glycolytic enzymes (e.g., Tdh2, Tdh3, Eno2, Pdc1, Adh1,
and Fba1) are enriched in the supernatant fraction. The only
exception is Pfk1 (not indicated), which is found in both pellet
and supernatant fractions. Many proteins involved in protein
synthesis (Eft1, Yef3, Prt1, Tif1, and Rpa0) are in the pellet,
possibly because of the association of ribosomes with the endoplasmic
reticulum. However, Efb1 is in the supernatant, as is
a substantial portion of the Eft1. Perhaps surprisingly, several
mitochondrial proteins (Atp2 [not shown] and Ilv5) are largely
in the supernatant. Perhaps glass bead breakage of cells releases
mitochondrial proteins. The nuclear protein Gsp1 is in
the pellet fraction. The enrichment produced by centrifugation
makes it possible to see minor spots which are otherwise poorly
resolved from surrounding proteins. Figure 6B shows that the
previously identified Tif1 spot is surrounded by as many as six
other spots that cofractionate. We observed six identical or
very similar additional spots when we overexpressed Tif1 from
a high-copy-number plasmid (not shown). Signal overlaps only
one or two of these spots in 32P-labeling experiments (Fig. 5),
and so the different forms are not mainly due to different 
phosphorylation states. 

DISCUSSION 

Our experience with developing a 2D gel protein database 
for S. cerevisiae is summarized here. With current technology, 
we can see the most abundant 1,200 proteins, which is about 
one-third to one-quarter of the proteins expressed. The re- 
maining proteins will be difficult to see and study with the 
methods that we have used, not because of a lack of sensitivity 
but because weak spots are covered by nearby strong spots. 

Of the 1,200 proteins seen, we have identified 148, with a 
bias toward the most abundant proteins. Steady application of 
the methods already used would allow identification of most of 
the remaining proteins. Gene overexpression will be particu- 
larly useful, since it is not affected by the lower abundance of 
the remaining visible proteins.

2D gels of the kind that we have used are not suitable for
visualization of rare proteins. However, it will be possible to
study on a global basis metabolic processes involving relatively
abundant proteins, such as protein synthesis, glycolysis,
gluconeogenesis, amino acid synthesis, cell wall synthesis, nucleotide
synthesis, lipid metabolism, and the heat shock response.

Gygi et al. (10) have recently completed a study similar to
ours. Despite generating broadly similar data, Gygi et al.
reached markedly different conclusions. We believe that both
mRNA abundance and codon bias are useful predictors of
protein abundance. However, Gygi et al. feel that mRNA
abundance is a poor predictor of protein abundance and that 
"codon bias is not a predictor of either protein or mRNA 
levels" (10). These different conclusions are partly a matter of 
viewpoint. Gygi et al. focus on the fact that the correlations of 
mRNA and codon bias with protein abundance are far from 
perfect, while we focus on the fact that, considering the wide 
range of mRNA and protein abundance and the undoubted 
presence of other mechanisms affecting protein abundance, 
the correlations are quite good. 

However, the different conclusions are also partly due to 
different methods of statistical analysis and to real differences 
in data. With respect to statistics, Gygi et al. used the Pearson 
product-moment correlation coefficient (r_p) to measure the
covariance of mRNA and protein abundance. Depending on
the subset of data included, their r_p values ranged from 0.1 to
0.94. Because of the low r_p values with some subsets of the
data, Gygi et al. concluded that the correlation of mRNA to
protein was poor. However, the r_p correlation is a parametric
statistic and so requires variates following a bivariate normal
distribution; that is, it would be valid only if both mRNA and
protein abundances were normally distributed. In fact, both
distributions are very far from normal (data not shown), and so
a calculation of r_p is inappropriate. There was no statistical
backing for the assertion that codon bias fails to predict pro- 
tein abundance. 

We have taken two statistical approaches. First, we have 
used the Spearman rank correlation coefficient (r_s). Since this
statistic is nonparametric, there is no requirement for the data
to be normally distributed. Using the r_s, we find that mRNA
abundance is well correlated with protein abundance (r_s =
0.74), and the CAI is also well correlated with protein abundance
(r_s = 0.80) (and also with mRNA abundance [data not
shown]). For the data of Gygi et al. (10), we obtained similar
results, though with their data the correlation is not as good; r_s
= 0.59 for the mRNA-to-protein correlation, and r_s = 0.59 for
the codon bias-to-protein correlation. 

In a second approach, we transformed the mRNA and protein
data to forms where they were normally distributed, to
allow calculation of an r_p (Materials and Methods). Two transformations,
Box-Cox and logarithmic, were used; both gave
good correlations with our data [e.g., r_p = 0.76 for log(adjusted
RNA) to log(protein)]. We were not able to transform the data 
of Gygi et al. to a normal distribution. 

Finally, there are also some differences in data between the 
two studies. These may be partly due to the different measure- 
ment techniques used: Gygi et al. measured protein abundance 
by cutting spots out of gels and measuring the radioactivity in 
each spot by scintillation counting, whereas we used phospho- 
rimaging of intact gels coupled to image analysis. We com- 
pared our data to theirs for the proteins common between the 
studies (but excluding proteins whose mRNAs are known to 
differ between rich and minimal media, and excluding Tif1,
which was anomalous in differing by 100-fold between the two
data sets). The r_s between the two protein data sets was 0.88
(P < 0.0001). Although this is a strong correlation, the fact that



it is less than 1.0 suggests that there may have been errors in 
measuring protein abundance in one or both studies. After 
normalizing the two data sets to assume the same amount of 
protein per cell, we found a systematic tendency for the protein 
abundance data of Gygi et al. to be slightly higher than ours for 
the highest-abundance proteins and also for the lowest-abun- 
dance proteins but slightly lower than ours for the middle- 
abundance proteins. These systematic differences suggest some 
systematic errors in protein measurement. Although we do not 
know what the errors are, we suggest the following as a rea- 
sonable speculation. For the highest-abundance proteins, we 
may have underestimated the amount of protein because of a 
slightly nonlinear response of the phosphorimager screens. For 
the lowest-abundance proteins, Gygi et al. may have overesti- 
mated the amount of protein because of difficulties in accu- 
rately cutting very small spots out of the gel and because of 
difficulties in background subtraction for these small, weak 
spots. The difference in the middle abundance proteins may be 
a consequence of normalization, given the two errors above. 

The low-abundance proteins in the data set of Gygi et al. 
have a poor correlation with mRNA abundance. We calculate 
that the r_s is 0.74 for the top 54 proteins of Gygi et al. but only
0.22 for the bottom 53 proteins, a statistically significant difference.
However, with our data set, the r_s is 0.62 for the top 33
proteins and 0.56 (not significantly different) for the bottom 33
proteins (which are comparable in abundance to the bottom
53 proteins of Gygi et al.). Thus, our data set maintains a good
correlation between mRNA and protein abundance even at
low protein abundance. This is consistent with our speculation
that protein quantification by phosphorimaging and image
analysis may be more accurate for small, weak spots than is
cutting out spots followed by scintillation counting. Our relatively
good correlations even for nonabundant proteins may
also reflect the fact that we used both SAGE data and RNA
hybridization data, which is most helpful for the least abundant
mRNAs. In summary, we feel that the poor correlation of
protein to mRNA for the nonabundant proteins of Gygi et al.
may reflect difficulty in accurately measuring these nonabundant
proteins and mRNAs, rather than indicating a truly poor
correlation in vivo. It is not surprising that observed correlations
would be poorer with less-abundant proteins and
mRNAs, simply because the accuracy of measurement would
be worse. 

How well can mRNA abundance predict protein abundance?
With r_p = 0.76 for logarithmically transformed mRNA
and protein data, the coefficient of determination, (r_p)^2, is 0.58.
This means that more than half (in log space) of the variation
in protein abundance is explained by variation in mRNA abundance.
When converted back to arithmetic values, protein
abundances vary over about 200-fold (Table 1), and (r_p)^2 =
0.58 for the log data means that of this 200-fold variation,
about 20-fold is explained by variation in the abundance of 
mRNA and about 10-fold is unexplained (but could be due 
partly to measurement errors). For proteins much less abun- 
dant than those considered here, we imagine the in vivo cor- 
relation between mRNA and protein abundance will be worse, 
and other regulatory mechanisms such as protein turnover will 
be more important. 
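
The split of the 200-fold range into explained and unexplained parts can be reproduced with a short calculation. The sketch below is one plausible reading of how those figures arise, namely apportioning the range on a log scale in proportion to the coefficient of determination; that interpretation and the function name are assumptions, not something the text spells out.

```python
import math


def partition_fold_range(r, total_fold):
    """Split a fold range on a log scale in proportion to r squared."""
    r_squared = r * r
    log_range = math.log10(total_fold)
    # Share of the log range attributed to the predictor (mRNA).
    explained_fold = 10 ** (r_squared * log_range)
    # Remaining share of the log range, left unexplained.
    unexplained_fold = 10 ** ((1 - r_squared) * log_range)
    return r_squared, explained_fold, unexplained_fold


r_squared, explained_fold, unexplained_fold = partition_fold_range(0.76, 200)
```

Under this reading the two parts multiply back to the full 200-fold range, and they come out near 21-fold and 9-fold, consistent with the "about 20-fold" and "about 10-fold" quoted above.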

Some important conclusions can be drawn from this sam- 
pling of the proteome. First, there is an enormous range of 
protein abundance, from nearly 2,000,000 molecules per cell 
for some glycolytic enzymes to about 100 per cell for some cell 
cycle proteins (26a). Second, about half of all cellular protein 
is found in fewer than 100 different gene products, which are 
mostly involved in carbohydrate metabolism or protein synthesis.
Third, the correlation between protein abundance and CAI
is log linear as far as we can see, which is from about 10,000
protein molecules per cell to about 1,000,000. This is somewhat
surprising, because it implies that selective forces for codon 
bias are significant even at moderate expression levels. It also 
means that codon bias is a useful predictor of protein abun- 
dance even for moderately low bias proteins. Fourth, there is a 
good correlation between protein abundance and mRNA 
abundance for the proteins that we have studied. This validates 
the use of mRNA abundance as a rough predictor of protein 
abundance, at least for relatively abundant proteins. Fifth, for 
these abundant proteins, there are about 4,000 molecules of 
protein for each molecule of mRNA. This last conclusion 
raises questions as to how the levels of nonabundant proteins 
are regulated and suggests that protein instability, regulated 
translation, suboptimal rates of translation, and other mechanisms
in addition to transcriptional control may be very important
for these proteins.